COLLAGEN-BINDING PROTEINS FROM 
STREPTOCOCCUS PYOGENES 



Field of the Invention 

The present invention relates in general to proteins from group A Streptococci 
(GAS) that can bind collagen, and in particular to collagen-binding proteins designated 
Cpa1 and Cpa49, and the nucleic acid sequences coding for those proteins, which 
have been isolated from Streptococcus pyogenes and which can be used in methods to 
inhibit collagen binding and thus treat or prevent infectious diseases caused by group 
A Streptococcus bacteria. 

Background of the Invention 

The Streptococci are a pathogenic genus of microorganisms which 
have been associated with a wide variety of infectious disorders including suppuration, 
abscess formation, a variety of pyogenic infections, and septicemia. In particular, 
Streptococcus pyogenes (a group A streptococcus, or GAS) is a prominent pathogen 
which causes skin and mucous membrane infections, as well as deep-seated 
connective tissue infections and severe, sometimes fatal, septicemia. Like many other 
pathogens, in order to infect the human host successfully, GAS must have the ability to 
adjust the expression of its virulence factors according to the varying conditions of 
different anatomical sites. 



In GAS, the expression of several virulence factors is positively regulated at the 
level of transcription by the Mga regulator. See Perez-Casal et al. (1991); Chen et al. 
(1993); Podbielski et al. (1995) and (1996). Regulated genes include M and M-related 
proteins (phagocytosis resistance, eukaryotic cell interactions), fibronectin-related 
proteins (serum opacity factor), SpeB (protease) and C5a peptidase (inactivation of 
complement factor C5a). Recent evidence has demonstrated that, in addition to iron 
levels, pH, CO2 and temperature (see Caparon et al., 1992; Podbielski et al., 1992; 
Okada et al., 1993; McIver et al., 1995), the activity of the Mga regulator is associated 
with the logarithmic and late logarithmic growth phases. See McIver et al. (1997). 

Another regulator in Streptococcus is RofA, a positive transcriptional regulator of 
the fibronectin-binding protein (prtF) (see Fogg et al., 1994 and 1997) that promotes 
bacterial attachment to the host extracellular matrix (see Hanski et al., 1992; and Van 
Heyningen et al., 1993). In contrast to Mga-controlled genes, RofA positively regulates 
prtF transcription as well as its own transcription in response to increased levels of O2. 
By a potentially independent mechanism, transcription of prtF is also induced in 
response to intracellular superoxide levels (see Gibson et al., 1996). 

These data have suggested differential expression of eukaryotic cell-binding 
proteins such as RofA-dependent prtF and Mga-dependent emm in response to O2 and 
CO2 partial pressures. These observations have led to the proposal that these 
regulators may influence the expression of proteins important for the attachment of 
GAS in different in vivo environments such as superficial Langerhans cells or 
subsurface keratinocytes (Okada et al. 1994; 1995). As has been observed with regard 
to other bacterial species, the attachment of bacteria to host cells or implanted 
biomaterials is generally initiated through "extracellular matrix proteins," or ECMs, 
a term which generally refers to families of macromolecules such as collagens, structural 
glycoproteins, proteoglycans and elastins, including fibronectin and fibrinogen, that 
provide support and modulate cellular behavior. However, the precise role of the 
bacteria's ability to bind to these extracellular matrix proteins and the knowledge of how 
to best utilize this information in order to prevent streptococcal infection has not yet 
been fully determined. 

Moreover, outside of the two regulators RofA and Mga, very little is known with 
regard to environmentally dependent virulence gene expression in GAS, and thus there 
has been very limited information with regard to the regulation and inhibition of the 
extracellular matrix proteins that are responsible for the attachment and infection 
caused by GAS. In light of the extremely severe nature of the bacterial infections 
caused by the Streptococcal bacteria, it is extremely important to make a determination 
of which specific proteins are responsible for attachment to the surface of targeted 
cells, and to be able to use this information in order to develop vaccines and other 
biological agents which can be used to treat or prevent the severe infections 
associated with group A streptococci. 

Summary of the Invention 

Accordingly, it is an object of the present invention to provide isolated proteins 
(adhesins) from group A streptococci which can bind to extracellular matrix proteins 
such as collagen so as to be useful in developing methods of inhibiting collagen 
binding and attachment of streptococcal bacteria to cells. 

It is a further object of the present invention to provide isolated streptococcal 
surface proteins that are able to inhibit adhesion to the immobilized extracellular matrix 
or host cells present on the surface of implanted biomaterials. 

It is a further object of the present invention to provide a vaccine which can be 
used in treating or preventing infection by group A streptococcal bacteria such as 
Streptococcus pyogenes. 

It is still further an object of the present invention to generate antisera and 
antibodies to the collagen binding proteins from GAS which can also be useful in 
developing methods of treatment which can inhibit binding of the streptococcal bacteria 
to host cells or to implanted biomaterials and thus be employed in order to treat or 
prevent Streptococcal infection. 

It is a further object of the present invention to provide improved materials and 
methods for detecting and differentiating collagen-binding proteins in streptococcal 
organisms in clinical and laboratory settings. 

It is a further object of the invention to provide nucleic acid sequences which 
code for the collagen binding proteins in GAS which can also be useful in producing 
the collagen-binding proteins of the invention and in developing probes and primers 
specific for identifying and characterizing these proteins. 

These and other objects are provided by virtue of the present invention which 
comprises isolated collagen binding proteins from group A streptococcal bacteria such 
as Streptococcus pyogenes along with their amino acid and nucleic acid sequences. 
Two of the specific proteins isolated in accordance with the invention are designated 
Cpa1 and Cpa49 which are obtained from the collagen binding region in Streptococcus 
pyogenes, and the sequences for these proteins are those as shown in SEQ ID NOS. 2 
and 4, respectively. The nucleic acid sequences coding for Cpa1 and Cpa49 are 
shown in SEQ ID NOS. 1 and 3, respectively. The isolated proteins of the present 
invention have been observed to bind to collagen, and thus can be utilized in methods 
of treating or preventing streptococcal infection through the inhibition of the ability of 
the bacteria to bind to collagen. 

In another aspect of the present invention, there are also provided antisera and 
antibodies generated against the collagen binding proteins of the present invention 
which also can be utilized in methods of treatment which involve inhibition of the 
attachment of the Cpa proteins to collagen. In particular, specific polyclonal antiserum 
against Cpa has been generated which has been shown to react with Cpa in Western 
immunoblots and ELISA assays and which interferes with Cpa binding to collagen. 
This antiserum can thus be used for specific agglutination assays to detect bacteria 
which express Cpa on their surface. The antiserum apparently does not cross-react 
with bacteria which express the fibronectin-binding protein F1 on their surface despite 
the fact that a portion of protein F1 exhibits sequence homologies to Cpa1 and Cpa49. 

Accordingly, in accordance with the invention, antisera and antibodies raised 
against the Cpa1 and Cpa49 proteins, or portions thereof, may be employed in 
vaccines, and other pharmaceutical compositions containing the proteins for 
therapeutic purposes are also provided herein. In addition, diagnostic kits containing 
the appropriate nucleic acid molecules, the Cpa1 or Cpa49 proteins, or antibodies or 
antisera raised against them are also provided so as to detect bacteria expressing 
these proteins. 

These embodiments and other alternatives and modifications within the spirit 
and scope of the disclosed invention will become readily apparent to those skilled in 
the art from reading the present specification and/or the references cited herein. 

Brief Description of the Drawing Figures 

Figure 1 is a schematic representation of a comparison of the nra/rofA- 
associated portions of group A streptococcal serotype M1, M6 and M49 strains. 
Results of pairwise comparisons of the deduced amino acid sequences of single ORF's 
are shown as percentage identity values between corresponding sequences. 
Sequence alignments were centered at the nra/rofA to prtF/cpa intergenic regions. All 
sequences are shown to scale. For designation of ORF's, see Table 1 hereinbelow. 
The M1 sequence was obtained from the GAS sequencing project (Roe et al., 1997), 
and the M6 sequence was taken from Hanski et al. (1992) and Fogg et al. (1994). The 
inserted box contains the comparison of the deduced Nra and RofA amino acid 
sequences. "." marks identical amino acid positions; marks gaps that were 
introduced into the RofA sequence to maximize alignment. The underlined sequence 
marks the potential helix-turn-helix identified by Fogg et al. (1997). 

Figure 2 depicts transcript analysis of nra and nra-regulated genes in a GAS 
wild-type (wt) and nra mutant (nra) strain. Total RNA was isolated from late log phase 
cells grown under aerobic (aer.) and anaerobic (anaer.) conditions. Unless 
otherwise indicated, 20 µg of total RNA was used per lane for Northern blotting. 
PCR-amplified and digoxigenin-labelled probes specific for nra, cpa, nifR3L and prtF (Table 
4) were used for hybridization. Northern analyses represent the results of transcription 
analysis of (1) the nra gene as shown in Figure 2A, (2) operons adjacent to the nra 
gene as shown in Figure 2B, and (3) the prtF gene, which is located at an unknown 
distance from nra, as shown in Figure 2C. In all cases, an increase in band intensity 
was observed using total RNA isolated from the nra mutant. With the exception of cpa, 
this increase was particularly pronounced in RNA prepared from aerobically grown 
cultures. The nra message in the wild-type strain was expressed at very low and 
sometimes undetectable levels. 

Figure 3 depicts transcript analysis of the positive global mga regulator gene in 
GAS wild-type (wt) and nra mutant strains, and the transcript analysis of nra, nifR3L 
and cpa in GAS wild-type (wt) and mga mutant strains. Total RNA was prepared from 
mid-log phase cells grown under anaerobic conditions and was subjected to Northern 
blot hybridization using the indicated RNA amounts per lane. PCR-amplified and 
digoxigenin-labelled probes specific for mga and nra (left) or nifR3L and cpa (right) 
were used for hybridization and subsequent CSPD visualization. 

Figure 4 is a diagram of transcription and control of nra and nra-regulated 
genes. Nra exhibits negative regulation (-) of its own expression, that of two adjacent 
operons and of the mga regulator gene. Mga is a positive regulator (+) of its own 
expression and that of nra. Promoters (p) and transcription terminators (tt) are shown 
in italics. For designation of ORFs, see Table 1. The sequences are drawn to scale. 

Figure 5 depicts attachment of GAS wild-type and nra mutant strains to 
immobilized human fibronectin and type I collagen. The bacteria were cultured on solid 
THY medium under anaerobic conditions until they reached stationary phase and were 
then harvested for binding assays. After FITC labeling of the cells, adherent cells were 
detected by measuring the relative light units (RLU) present in each sample. 
Normalization of the values was performed as indicated below in the Examples section. 

Detailed Description of the Preferred Embodiments 

In accordance with the present invention, there are provided isolated collagen 
binding proteins from group A streptococcal bacteria, and their corresponding amino 
acid and nucleic acid sequences are described herein. Two specific proteins isolated in 
accordance with the present invention are designated Cpa1, having the nucleic acid 
sequence as shown in SEQ ID NO. 1 and the amino acid sequence of SEQ ID NO. 2, 
and Cpa49, which has the nucleic acid sequence as shown in SEQ ID NO. 3 and the 
amino acid sequence shown in SEQ ID NO. 4. Using different experimental 
approaches, it has now been shown that Cpa1 and Cpa49 both bind to collagen, e.g., 
via binding of soluble 125-iodine labeled collagen and inhibition of binding to immobilized 
collagen by recombinant purified Cpa1 protein and by specific antisera directed to 
Cpa49/Cpa1, and these proteins or their antibodies can thus be useful in the 
treatment and prevention of group A streptococcal disease, or in techniques to identify 
such proteins, as described further below. It has also been determined via collagen 
binding experiments with recombinant purified Cpa fragments that the collagen binding 
domain can be deduced to reside in the third (C-terminal) quarter of the protein. 

In addition to the structures of Cpa1 and Cpa49 as shown in the amino acid 
sequences of SEQ ID NOS. 2 and 4, respectively, as would be recognized by one of 
ordinary skill in this art, modification and changes may be made in the structure of the 
peptides of the present invention and DNA segments which encode them and still 
obtain a functional molecule that encodes a protein or peptide with desirable 
characteristics. The amino acid changes may be achieved by changing the codons of 
the DNA sequence. For example, certain amino acids may be substituted for other 
amino acids in a protein structure without appreciable loss of interactive binding 
capacity with structures such as, for example, antigen-binding regions of antibodies or 
binding sites on substrate molecules. Since it is the interactive capacity and nature of 
a protein that defines that protein's biological functional activity, certain amino acid 
sequence substitutions can be made in a protein sequence, and, of course, its 
underlying DNA coding sequence, and nevertheless obtain a protein with like 
properties. It is thus contemplated by the inventors that various changes may be made 
in the peptide sequences of the disclosed compositions, or corresponding DNA 
sequences which encode said peptides without appreciable loss of their biological 
utility or activity. 

In addition, amino acid substitutions are also possible without affecting the 
collagen binding ability of the isolated proteins of the invention, provided that the 
substitutions provide amino acids having sufficiently similar properties to the ones in 
the original sequences. 

Accordingly, acceptable amino acid substitutions are generally therefore based 
on the relative similarity of the amino acid side-chain substituents, for example, their 
hydrophobicity, hydrophilicity, charge, size, and the like. Exemplary substitutions which 
take various of the foregoing characteristics into consideration are well known to those 
of skill in the art and include: arginine and lysine; glutamate and aspartate; serine and 
threonine; glutamine and asparagine; and valine, leucine and isoleucine. The isolated 
proteins of the present invention can be prepared in a number of suitable ways known 
in the art including typical chemical synthesis processes to prepare a sequence of 
polypeptides. 
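
By way of illustration only, the exemplary substitution groups listed above can be expressed as a small lookup. The following is a minimal sketch and not a method of the invention: the one-letter residue codes, the function names and the short test sequences are assumptions introduced here for the example, and membership in a group does not by itself establish that collagen binding is retained.

```python
# Exemplary conservative substitution groups, taken from the list in the text.
# One-letter amino acid codes are used for brevity (an illustrative choice).
CONSERVATIVE_GROUPS = [
    {"R", "K"},        # arginine and lysine
    {"E", "D"},        # glutamate and aspartate
    {"S", "T"},        # serine and threonine
    {"Q", "N"},        # glutamine and asparagine
    {"V", "L", "I"},   # valine, leucine and isoleucine
]


def is_conservative(a: str, b: str) -> bool:
    """Return True if residues a and b are identical or share a listed group."""
    a, b = a.upper(), b.upper()
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_substitutions(original: str, variant: str):
    """List (position, original residue, variant residue, conservative?) for
    every position at which two equal-length sequences differ (1-based)."""
    if len(original) != len(variant):
        raise ValueError("sequences must have equal length")
    changes = []
    for pos, (a, b) in enumerate(zip(original, variant), start=1):
        if a != b:
            changes.append((pos, a, b, is_conservative(a, b)))
    return changes


# Hypothetical five-residue peptides, not taken from SEQ ID NOS. 2 or 4.
changes = classify_substitutions("MKVLS", "MRVIA")
```

In this hypothetical comparison the K-to-R and L-to-I changes fall within the listed groups, whereas the S-to-A change does not.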

The synthetic polypeptides of the invention can thus be prepared using the well 
known solid phase, liquid phase, or peptide condensation techniques, or 
any combination thereof, and can include natural and unnatural amino acids. Amino acids 
used for peptide synthesis may be standard Boc (Nα-amino protected 
Nα-t-butyloxycarbonyl) amino acid resin with the standard deprotecting, neutralization, 
coupling and wash protocols of the original solid phase procedure of Merrifield (J. Am. 
Chem. Soc., 85:2149-2154, 1963), or the base-labile Nα-amino protected 
9-fluorenylmethoxycarbonyl (Fmoc) amino acids first described by Carpino and Han (J. 
Org. Chem., 37:3403-3409, 1972). Both Fmoc and Boc Nα-amino protected amino 
acids can be obtained from Fluka, Bachem, Advanced Chemtech, Sigma, Cambridge 
Research Biochemical, or Peninsula Labs or other chemical companies 
familiar to those who practice this art. In addition, the method of the invention can be 
used with other Nα-protecting groups that are familiar to those skilled in this art. Solid 
phase peptide synthesis may be accomplished by techniques familiar to those in the art 
and provided, for example, in Stewart and Young, 1984, Solid Phase Synthesis, 
Second Edition, Pierce Chemical Co., Rockford, IL; Fields and Noble, 1990, Int. J. Pept. 
Protein Res. 35:161-214, or using automated synthesizers, such as those sold by ABS. 
Thus, polypeptides of the invention may comprise D-amino acids, a combination of D- 
and L-amino acids, and various "designer" amino acids (e.g., β-methyl amino acids, 
Cα-methyl amino acids, and Nα-methyl amino acids, etc.) to convey special properties. 
Synthetic amino acids include ornithine for lysine, fluorophenylalanine for 
phenylalanine, and norleucine for leucine or isoleucine. Additionally, by assigning 
specific amino acids at specific coupling steps, α-helices, β turns, β sheets, γ-turns, 
and cyclic peptides can be generated. 

In a further embodiment, subunits of peptides that confer useful chemical and 
structural properties will be chosen. For example, peptides comprising D-amino acids 
will be resistant to L-amino acid-specific proteases in vivo. In addition, the present 
invention envisions preparing peptides that have more well defined structural 
properties, and the use of peptidomimetics and peptidomimetic bonds, such as ester 
bonds, to prepare peptides with novel properties. In another embodiment, a peptide 
may be generated that incorporates a reduced peptide bond, i.e., R1-CH2-NH-R2, where 
R1 and R2 are amino acid residues or sequences. A reduced peptide bond may be 
introduced as a dipeptide subunit. Such a molecule would be resistant to peptide bond 
hydrolysis, e.g., protease activity. Such peptides would provide ligands with unique 
function and activity, such as extended half-lives in vivo due to resistance to metabolic 
breakdown or protease activity. It is also well known that in certain systems, 
constrained peptides show enhanced functional activity (Hruby, Life Sciences, 
31:189-199, 1982); (Hruby et al., Biochem J., 268:249-262, 1990). 

Also provided herein are sequences of nucleic acid molecules that selectively 
hybridize with nucleic acid molecules encoding the collagen-binding proteins of the 
invention, or portions thereof, such as consensus or variable sequence amino acid 
motifs, from Streptococcus pyogenes described herein or complementary sequences 
thereof. By "selective" or "selectively" is meant a sequence which does not hybridize 
with other nucleic acids. This is to promote specific detection of Cpa1 or Cpa49. 
Therefore, in the design of hybridizing nucleic acids, selectivity will depend upon the 
other components present in a sample. The hybridizing nucleic acid should have at 
least 70% complementarity with the segment of the nucleic acid to which it hybridizes. 
As used herein to describe nucleic acids, the term "selectively hybridizes" excludes the 
occasional randomly hybridizing nucleic acids, and thus, has the same meaning as 
"specifically hybridizing". The selectively hybridizing nucleic acids of the invention can 
have at least 70%, 80%, 85%, 90%, 95%, 97%, 98%, and 99% complementarity with 
the segment of the sequence to which they hybridize. 
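
The percentage complementarity referred to above can be illustrated with a short calculation. The sketch below is offered under stated assumptions: both sequences are written 5' to 3', they are of equal length, no gaps are permitted, and the example sequences and function names are hypothetical. It counts matched base pairs only and does not model hybridization stringency.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(probe: str, target_segment: str) -> float:
    """Percentage of probe positions that base-pair with the target segment.

    Both sequences are given 5' to 3' and must have the same length;
    a perfectly complementary probe equals the target's reverse complement.
    """
    if not probe or len(probe) != len(target_segment):
        raise ValueError("probe and target segment must be non-empty and equal in length")
    ideal_probe = reverse_complement(target_segment)
    matches = sum(p == q for p, q in zip(probe.upper(), ideal_probe))
    return 100.0 * matches / len(probe)


def meets_threshold(probe: str, target_segment: str, threshold: float = 70.0) -> bool:
    """Check the probe against a minimum complementarity (70% by default)."""
    return percent_complementarity(probe, target_segment) >= threshold


# Hypothetical 10-nucleotide target segment, not taken from SEQ ID NOS. 1 or 3.
target = "AAGGCCTTAC"
perfect = percent_complementarity("GTAAGGCCTT", target)
partial = percent_complementarity("GTAAGGCCAA", target)
```

Here the first probe pairs at every position, while the second carries two mismatches out of ten and so still lies above the 70% lower limit stated in the text.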

The invention contemplates sequences, probes and primers which selectively 
hybridize to the encoding DNA or the complementary, or opposite, strand of DNA as 
those specifically provided herein. Specific hybridization with nucleic acid can occur 
with minor modifications or substitutions in the nucleic acid, so long as functional 
species-specific hybridization capability is maintained. By "probe" is meant nucleic 
acid sequences that can be used as probes or primers for selective hybridization with 
complementary nucleic acid sequences for their detection or amplification, which 
probes can vary in length from about 5 to 100 nucleotides, or preferably from about 10 
to 50 nucleotides, or most preferably about 18-24 nucleotides. Therefore, the terms 
"probe" or "probes" as used herein are defined to include "primers". Isolated nucleic 
acids are provided herein that selectively hybridize with the species-specific nucleic 
acids under stringent conditions and should have at least 5 nucleotides complementary 
to the sequence of interest as described by Sambrook et al., 1989, Molecular 
cloning: a laboratory manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring 
Harbor, N.Y. 
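
The length ranges given above for probes and primers can be summarized as a simple classification. This is a minimal sketch only: the function name and the returned labels are assumptions made for the example, and the text qualifies each bound with "about", so the hard cut-offs used here are a simplification.

```python
def classify_probe_length(n_nucleotides: int) -> str:
    """Map a probe or primer length to the ranges stated in the text.

    Ranges are checked from the narrowest to the widest:
    about 18-24 nt (most preferred), about 10-50 nt (preferred),
    about 5-100 nt (acceptable).
    """
    if 18 <= n_nucleotides <= 24:
        return "most preferred"
    if 10 <= n_nucleotides <= 50:
        return "preferred"
    if 5 <= n_nucleotides <= 100:
        return "acceptable"
    return "outside stated range"


# Hypothetical probe lengths for illustration.
labels = [classify_probe_length(n) for n in (3, 8, 20, 30, 75, 120)]
```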

If used as primers, the composition preferably includes at least two nucleic acid 
molecules which hybridize to different regions of the target molecule so as to amplify a 
desired region. Depending on the length of the probe or primer, the target region can 
range between 70% complementary bases and full complementarity and still hybridize 
under stringent conditions. For example, for the purpose of diagnosing the presence of 
the S. pyogenes, the degree of complementarity between the hybridizing nucleic acid 
(probe or primer) and the sequence to which it hybridizes (e.g., group A streptococcal 
DNA from a sample) is at least enough to distinguish hybridization with a nucleic acid 
from other bacteria. 

The nucleic acid sequences encoding Cpa1 or Cpa49 proteins or portions 
thereof, such as consensus or variable sequence amino acid motifs, can be inserted 
into a vector, such as a plasmid, and recombinantly expressed in a living organism to 
produce recombinant Cpa1 or Cpa49 proteins or active fragments thereof. 

Recombinant proteins are produced by methods well known to those skilled in 
the art. A cloning vector, such as a plasmid or phage DNA, is cleaved with a restriction 
enzyme, and the DNA sequence encoding the Cpa1 or Cpa49 protein or active 
fragments thereof, such as consensus or variable sequence amino acid motifs, is 
inserted into the cleavage site and ligated. The cloning vector is then inserted into a 
host to produce the protein or fragment encoded by the Cpa1 or Cpa49 encoding DNA. 
Suitable hosts include bacterial hosts such as Escherichia coli, Bacillus subtilis, yeasts 
and other cell cultures. Production and purification of the gene product may be 
achieved and enhanced using known molecular biology techniques. 

In accordance with the present invention, we have sequenced an 11.5 kb 
genomic fragment of serotype M49 GAS strain CS101 harboring the nra gene that is 
63% homologous to the rofA positive regulatory gene. In contrast to the apparent 
function of rofA, nra was found to encode a negative regulator affecting its own 
expression, the expression of two adjacent operons and several other genes. Some of 
these genes encode potential intracellular proteins, whereas others encode surface 
proteins such as the collagen-binding CPA (this study) and the fibronectin-binding 
PrtF2 (Jaffe et al., 1996), which may be involved in virulence. In addition, nra 
influences the expression of the mga regulatory gene and, thereby, the factors 
contained in the mga region. Expression of nra was found to be maximal in early 
stationary phase and was not significantly influenced by atmospheric conditions. 
Overall, the present invention includes the identification of a unique GAS negative 
regulator and implicates its function in a regulatory network affecting virulence factor 
expression in GAS, as set forth in detail in Podbielski et al., Molecular Microbiol. 
31(4):1051-1064 (1999), incorporated herein by reference. 

In accordance with the present invention, an analysis was undertaken of the 
genomic region containing the nra gene. In this analysis, an 11,489 bp portion of the 
GAS chromosome was sequenced from a Lambda library of the serotype M49 GAS 
genome (GenBank accession no. U49397). Computer analysis of this sequence 
revealed the presence of nine complete and two partial predicted open reading frames 
(ORFs) (Fig. 1). Homology comparisons with GenBank entries demonstrated the 
similarity of 10 of the ORFs to known bacterial protein sequences (Table 1). Detailed 
analysis of the gene products encoded in this region (see the following sections) 
revealed the presence of a negative regulatory gene, nra, and immediately upstream in 
the opposite orientation, a collagen-binding protein, cpa. The genomes of GAS 
serotypes in GenBank and the available streptococcal serotype M1 genomic 
sequences (Roe et al., 1997) were searched for homologues to nra and cpa (Fig. 1). 
The gene sharing the highest degree of homology with nra was the positive regulatory 
factor, rofA, while cpa showed the highest homology to a gene for a fibronectin-binding 
protein, prtF. 

A more detailed computer analysis of the similarity between the negative 
regulator nra and the positive regulator rofA showed that both contain similar 
N-terminal double helix-turn-helix motifs (Fig. 1) whose intramolecular localization would 
be consistent with a negative or dual regulatory function of the proteins (Prag et al., 
1997). Homology between the collagen-binding cpa genes and the fibronectin-binding 
prtF genes was confined to the N-terminal sections and did not include the portions of 
prtF encoding its two fibronectin binding domains (Talay et al., 1994; Ozeri et al., 1996; 
Sela et al., 1993). The genes of fibronectin-binding proteins F have at least two 
isotypes, prtF (Hanski and Caparon, 1992) and sfb (Talay et al., 1992), which exhibit 
52% sequence homology. Similarly the genes of collagen-binding proteins, cpa, also 
appeared to have multiple forms such as cpa in M49 and cpa.1 in M1, which shared 
approximately 53% homology to each other and 23% homology to the prtF family of 
proteins. 

In order to confirm and extend the results of the sequence comparisons, 
oligonucleotides specific for prtF (Natanson et al., 1995), prtF2, cpa (M49/M1), nra and 
rofA genes (Table 4) were synthesized. These oligonucleotides were used as 
polymerase chain reaction (PCR) primers on genomic DNA from serotypes M1, M2, M3, 
M4, M5, M6, M12, M18, M24 (Table 2) and eight independent M49 strains. In addition, 
the primers were used to generate probes for Southern blot hybridizations that were 
performed with EcoRI- and HindIII-digested genomic DNA of the 10 serotype strains 
(Table 2). Based on the results from both analyses, no variation was found within the 
M49 serotype. However, different M protein serotype strains harbored either rofA, nra 
or both genes. Any combination of regulator and binding protein (cpa, prtF, prtF2) 
could also be found. Therefore, the nra/cpa and rofA/prtF pairs are not mutually 
exclusive, and single strains can also contain any combination of regulators and 
binding proteins. What was particularly striking was that, although the M49- and 
M1-contained gene pairs had different regulatory proteins (cpa/nra and cpa.1/rofA.1, 
respectively), the binding and regulatory genes were flanked by five genes sharing 
>98% homology and three genes with <50% homology that indicated that cpa and nra 
could be part of a pathogenicity island. In the serotype M49 strain used for further 
study, in addition to the cpa/nra gene pair, a prtF2 gene was contained in a separate 
location on the GAS chromosome. The localization of other regulator/binding protein 
pairs, especially in strains containing multiple regulators or binding proteins, awaits 
further analysis. 

The transcriptional organization of nra, cpa and flanking genes was determined 
by Northern blotting using PCR-generated specific probes (see Table 4 for primer 
sequences). Each Northern blot was repeated three or four times, and the results are 
given in Fig. 2. To determine the effect of nra on the transcription of itself and 
neighboring genes, an nra mutant was constructed by genomic insertion of the plasmid 
pFW11. The construct was confirmed by Southern blot hybridization and specific 
PCRs using nra mutant genomic DNA (data not shown). As transcription of rofA, the 
gene sharing the greatest homology to nra, is increased under aerobic conditions, the 
Northern analyses were carried out on RNA isolated from cells grown under both 
aerobic and anaerobic conditions. It should be noted that nra was transcribed at very 
low rates and was barely detectable in 80 µg of total RNA. 

The nra region was found to be monocistronically transcribed (≈1.8 kb) and 
upregulated in an nra mutant. Transcription was slightly, although probably not 
significantly, induced under aerobic conditions (Fig. 2A). The three genes immediately 
downstream of nra, ORF5-nifR3L-kinL, were transcribed as an operon whose 2.6 kb 
transcript, as detected with a nifR3L probe, is shown in Fig. 2B. The ORF5-kinL operon 
was expressed at higher levels under aerobic conditions and in an nra mutant, 
suggesting that this operon falls under the control of nra. The different transcription 
rates of nifR3L in wild-type and nra mutant strains were confirmed by Northern blots 
performed on serial dilutions of total mRNA (Fig. 2B). Reverse transcriptase (RT)-PCR 
carried out on total mRNA using primers directed to the 3' end of nra and the 5' end of 
ORF5 yielded a product that would be present only if at least some transcriptional 
readthrough occurs between nra and ORF5 (data not shown). Thus, inverted repeats 
present in the non-coding section between nra and ORF5 serve only as a weak 
transcriptional terminator, allowing a small amount of readthrough between nra and 
ORF5. However, the majority of the nifR3L transcript originates from a second 
promoter upstream of ORF5, as only the ORF5-kinL transcript could be visualized on 
the Northern blots. Because insertion of pFW11 in nra disrupted readthrough between 
nra and ORF5, the only promoter still present in the nra mutants was the promoter 
ahead of ORF5. As the ORF5-kinL product was still increased in the nra mutants, it 
indicates that nra also has a negative regulatory effect at the promoter immediately 
upstream of ORF5. 

Northern analyses using a cpa probe detected a 5.2 kb transcript composed of 
the four genes (cpa-ORF2) located immediately upstream of and in the opposite 
orientation to nra (Fig. 2B). Transcription of the cpa operon was also increased in an 
nra mutant, suggesting its regulation by nra. However, unlike the nra and ORF5-kinL 
transcripts, the cpa-ORF2 transcript was more abundant under anaerobic conditions, 
suggesting a possible superimposed second regulatory mechanism for this operon. 

Northern blots using a prtF2 probe detected an mRNA consistent in size with a 
monocistronic transcription of prtF2 (Fig. 2C). Although the gene is located at a distant 
site in the chromosome, increased transcription in an nra mutant was detected, and its 
expression was increased under aerobic conditions. However, the nra mutation 
did not generally influence mRNA transcription rate or stability, as the recA transcript 
was not affected in the nra mutant (data not shown). 

As nra appeared to be a global negative regulator of virulence factors, Northern 
blots were used to determine whether nra and the global positive virulence factor 
regulator mga (Fig. 3) affected each other. Levels of mga mRNA were increased in the 
nra mutant. When an mga mutant (Podbielski et al., 1995) was used for Northern blot 
analysis, the nra message was found to be decreased in the mga mutant, which led to a 
corresponding increase in the nifR3L and cpa transcripts that are negatively regulated 
by nra (Fig. 3). 

Taken together, the data from the different transcript analyses indicate that the 
nra gene product is a negative regulator of its own expression and the two adjacent 
operons as well as of prtF2 and mga (Fig. 4). The mga regulator, in turn, was 
suggested to be a positive regulator of nra expression and, thus, an indirect suppressor 
of nra-dependent genes (Fig. 4). 
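
The regulatory relationships summarized above can be pictured as a small signed 
network. The following Python sketch is illustrative only: the edge table restates the 
relationships described in the preceding paragraphs, while the operon labels 
(cpa_operon, ORF5_operon) and the function names are hypothetical and are not part 
of the disclosure. Multiplying the signs along a path shows why mga, a positive 
regulator of nra, would act as an indirect suppressor of the genes repressed by nra. 

```python
# Signed regulatory edges as summarized in the text.
# +1 means the regulator increases expression of the target, -1 means it decreases it.
# The labels "cpa_operon" and "ORF5_operon" are hypothetical shorthand for the two
# operons adjacent to nra.
EDGES = {
    ("mga", "nra"): +1,
    ("nra", "nra"): -1,
    ("nra", "cpa_operon"): -1,
    ("nra", "ORF5_operon"): -1,
    ("nra", "prtF2"): -1,
    ("nra", "mga"): -1,
}


def path_sign(path):
    """Return the product of edge signs along a list of gene names."""
    sign = 1
    for regulator, target in zip(path, path[1:]):
        sign *= EDGES[(regulator, target)]
    return sign


def describe(path):
    """Return a short label for the net effect of the first gene on the last."""
    return "net activation" if path_sign(path) > 0 else "net repression"


if __name__ == "__main__":
    for route in (["mga", "nra", "cpa_operon"], ["mga", "nra", "prtF2"]):
        print(" -> ".join(route), ":", describe(route))
```

The sketch treats each relationship as a simple sign and ignores the strength of 
regulation and any superimposed second mechanism, such as the one suggested for the 
cpa operon. 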

With regard to the gene coding for the collagen-binding region of the group A 
streptococci, the cpa gene was demonstrated to be negatively regulated by the nra 
gene product. To determine whether Cpa was involved in matrix molecule interactions, 
a recombinant Cpa-maltose-binding protein fusion was expressed in Escherichia coli. 
After purification and labeling, it was subjected to an enzyme-linked binding assay with 
the immobilized human matrix proteins, collagen type I, fibronectin and laminin. Using 
the purified maltose-binding protein as a negative control, the Cpa-fusion protein bound 
significantly to collagen and, to a lesser extent, to laminin (P<0.05 as determined by the 
Wilcoxon range test) (Table 3). Binding of Cpa to fibronectin and BSA remained at the 
level of the maltose-binding protein alone. Thus, like protein F2, Cpa is a second nra- 
controlled, potential GAS surface protein, exhibiting human matrix protein-binding 
properties. 
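
As an illustration of how a rank-based comparison of this kind can be carried out, 
the following Python sketch computes a rank-sum (Mann-Whitney U) statistic and an 
exact one-sided permutation P value. It assumes that the Wilcoxon test referred to 
above is a two-sample rank test. The absorbance values are invented for illustration 
and are not the data of Table 3, and the function names are hypothetical. 

```python
from itertools import combinations


def u_statistic(xs, ys):
    """Count pairs in which a value from xs exceeds a value from ys (ties count 0.5)."""
    return sum((x > y) + 0.5 * (x == y) for x in xs for y in ys)


def exact_one_sided_p(xs, ys):
    """Fraction of all relabelings whose U is at least as large as the observed U.

    Every way of choosing len(xs) values from the pooled data is enumerated, so this
    is only practical for small samples.
    """
    pooled = list(xs) + list(ys)
    observed = u_statistic(xs, ys)
    hits = 0
    total = 0
    for chosen in combinations(range(len(pooled)), len(xs)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        total += 1
        if u_statistic(group_a, group_b) >= observed:
            hits += 1
    return hits / total


if __name__ == "__main__":
    # Hypothetical absorbance readings, not taken from the disclosure.
    fusion_on_collagen = [0.9, 1.1, 1.0, 1.2]
    control_on_collagen = [0.2, 0.3, 0.25, 0.35]
    print("U =", u_statistic(fusion_on_collagen, control_on_collagen))
    print("one-sided P =", exact_one_sided_p(fusion_on_collagen, control_on_collagen))
```

With four readings per group and complete separation, only one of the 70 possible 
relabelings is as extreme as the observed one, so the exact one-sided P value is 1/70, 
which is below 0.05. 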

The regulation of these binding proteins by nra would predict that stationary 
phase M49 nra mutants may still contain Cpa and protein F2, as they continue to 
transcribe cpa and prtF2 upon entry into stationary phase. This could result in better 
fibronectin and collagen binding by stationary phase nra mutants. To test this 
prediction, M49 wild type and nra mutant strains were cultured on plates under 
anaerobic conditions until stationary phase was reached. The cells were harvested and 
labeled with fluorescein isothiocyanate (FITC), and the binding of the two strains to 
immobilized collagen and fibronectin was measured. The nra mutant exhibited 
significantly increased binding to both matrix proteins compared with the wild type (Fig. 
5). Collagen-binding assays conducted with unmarked cells that were detected with 
labeled polyclonal serum yielded similar results (data not shown), suggesting that the 
FITC-labeling protocol did not damage the cells or alter binding significantly. As 
recombinant Cpa was found to block the binding of FITC-labeled GAS to immobilized 
collagen (data not shown), the binding of cells to collagen is probably mediated through 
the interaction of Cpa and collagen. Overall, these data indicate that, while wild-type 
bacteria could decrease their affinity to matrix proteins when entering stationary growth 
phase, the nra mutants no longer had this ability. 

The organization of the genomic regions controlled by nra was remarkably 
similar to that of the regions flanking rofA (Fig. 1). The five downstream genes were more than 98% 
homologous. The upstream four-gene operon structure was conserved for both 
regulators. However, the homology of these genes was only 43-52% across serotypes. 
In rofA-containing M6, the first gene upstream was the fibronectin-binding protein gene, 
prtF. In the rofA-containing serotype M1 and the nra-containing serotype M49, the first 
gene of the upstream operon was a novel gene, cpa. Protein purification and 
binding studies showed that cpa encoded a collagen-binding protein that was unable to 
bind fibronectin. Further PCR and Southern hybridization analysis of other GAS M 
serotypes confirmed that there was no correlation between the regulator (nra/rofA) and 
the binding protein contained in the upstream operon (prtF/cpa). In addition, strains 
were found that contained both regulators and/or multiple binding proteins. For 
example, serotype M49 contained an nra/cpa pair. However, a prtF2 gene located 
elsewhere in the chromosome was monocistronically transcribed and still negatively 
regulated by nra. The presence of both the positive rofA regulator and the negative nra 
regulator in the serotype M5 and the presence of only rofA in serotype M6 may explain 
the influences of genomic background noted during studies of RofA regulation in these 
serotypes (Van Heyningen et al., 1993; Fogg and Caparon, 1997). 

The expression of nra during growth was followed using a luciferase reporter 
gene fused to the 3' end of nra. The high-sensitivity detection of luciferase activity by a 
luminometer coupled with the 10 min half-life of luciferase in GAS (unpublished results) 
allowed the analysis of luc-fusion activity even at low cell densities. nra was 
transcribed at the highest rate during early stationary phase and was not significantly 
influenced by atmospheric conditions. This was in contrast to rofA, which has been 
described as being maximally active under aerobic conditions (Fogg and Caparon, 
1997). The differences in these results could reflect either differences in sensor 
capacity between rofA and nra or a methodological difference in the assay methods 
used. The rofA measurements were done by determining the level of an accumulated 
stable β-galactosidase reporter from a multicopy plasmid obtained using the 
experimental procedures described in the examples below. 

In addition to the use of the Cpa proteins above in various procedures, including the 
detection of the presence of Cpa1 or Cpa49 or their antibodies, the present invention 
also contemplates the use of the nucleic acids described herein to detect and identify 
the presence of collagen-binding GAS as well. The methods are useful for diagnosing 
group A streptococcal infections and other streptococcal diseases such as may occur 
in catheter related infections, biomaterial related infections, respiratory tract infections, 
cardiac, gastrointestinal or central nervous system infections, ocular infections, wound 
infections, skin infections, and a myriad of other diseases including conjunctivitis, 
keratitis, cellulitis, myositis, septic arthritis, osteomyelitis, bovine mastitis, and canine 
pyoderma, all as affected by group A streptococcal bacteria. 

In accordance with the invention, a preferred method of detecting the presence 
of Cpa1 or Cpa49 proteins involves the steps of obtaining a sample suspected of 
containing group A streptococci. The sample may be taken from an individual, for 
example, from one's blood, saliva, tissues, bone, muscle, cartilage, or skin. The cells 
can then be lysed, and the DNA extracted, precipitated and amplified. Detection of 
DNA from group A streptococci can be achieved by hybridizing the amplified DNA with 
a probe for GAS that selectively hybridizes with the DNA as described above. 
Detection of hybridization is indicative of the presence of group A streptococci. 
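
The hybridization step can be pictured with a deliberately simplified Python sketch. 
It assumes perfect Watson-Crick pairing between a single-stranded probe and one 
strand of the amplified DNA, whereas real hybridization tolerates mismatches 
depending on stringency. The sequences and function names are hypothetical and are 
not taken from SEQ ID NOS. 1 or 3. 

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_hybridizes(probe, target_strand):
    """Return True if the probe can pair perfectly with a stretch of target_strand.

    A probe pairs with the target where the target contains the probe's reverse
    complement. Mismatches, melting temperature and salt conditions are ignored.
    """
    return reverse_complement(probe) in target_strand.upper()


if __name__ == "__main__":
    amplified = "TTGACCGTAAGC"   # hypothetical amplified fragment
    probe = "TTACGGTC"           # hypothetical probe, complementary to GACCGTAA
    print(probe_hybridizes(probe, amplified))
    print(probe_hybridizes("AAAAAAAA", amplified))
```

In practice, the labeled probe signal rather than a string match reports 
hybridization; the sketch only shows the complementarity requirement. 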

Preferably, detection of nucleic acid (e.g. probes or primers) hybridization can 
be facilitated by the use of detectable moieties. For example, the probes can be 
labeled with biotin and used in a streptavidin-coated microtiter plate assay. Other 
detectable moieties include radioactive labeling, enzyme labeling, and fluorescent 
labeling, for example. 

DNA may be detected directly or may be amplified enzymatically using 
polymerase chain reaction (PCR) or other amplification techniques prior to analysis. 
RNA or cDNA can be similarly detected. Increased or decreased expression of Cpa1 or 
Cpa49 can be measured using any of the methods well known in the art for the 
quantification of nucleic acid molecules, such as, for example, amplification, PCR, RT- 
PCR, RNase protection, Northern blotting, and other hybridization methods. 

Diagnostic assays for Cpa1 or Cpa49 proteins or active portions thereof, such as 
consensus or variable sequence amino acid motifs, or anti-Cpa1 or anti-Cpa49 antibodies 
may also be used to detect the presence of a streptococcal bacterium such as 
Streptococcus pyogenes. Assay techniques for determining protein or antibody levels 
in a sample are well known to those skilled in the art and include methods such as 
radioimmunoassay, Western blot analysis and ELISA. 

The isolated, recombinant or synthetic proteins of the present invention, or 
antigenic portions thereof (including epitope-bearing fragments), or fusion proteins 
including the Cpa1 or Cpa49 proteins as described above, can be administered to 
animals as immunogens or antigens, alone or in combination with an adjuvant, for the 
production of antibodies reactive with Cpa1 or Cpa49 proteins or portions thereof. In 
addition, the proteins can be used to screen antibodies or antisera for hyperimmune 
patients from whom can be derived specific antibodies having a very high affinity for the 
proteins. 

Antibodies to Cpa1 or Cpa49, or to fragments thereof, can also be used in 
accordance with the invention for the specific detection of collagen-binding 
streptococcal proteins, for the prevention of infection from group A streptococci, for the 
treatment of an ongoing infection, or for use as research tools. The term "antibodies" 
as used herein includes monoclonal, polyclonal, chimeric, single chain, bispecific, 
simianized, and humanized or primatized antibodies as well as Fab fragments, 
including the products of an Fab immunoglobulin expression library. Generation of any 
of these types of antibodies or antibody fragments is well known to those skilled in the 
art. In the present case, specific polyclonal antiserum against Cpa has been generated 
which reacts with Cpa in Western immunoblots and ELISA assays and interferes with 
Cpa binding to collagen. The antiserum can be used for specific agglutination assays to 
detect bacteria which express Cpa on their surface. The antiserum does not cross-react 
with bacteria which express the fibronectin-binding protein F1 on their surface, 
although a portion of protein F1 exhibits sequence homologies to Cpa1 and Cpa49. 

Any of the above described antibodies may be labeled directly with a detectable 
label for identification and quantification of group A streptococci. Labels for use in 
immunoassays are generally known to those skilled in the art and include enzymes, 
radioisotopes, and fluorescent, luminescent and chromogenic substances, including 
colored particles such as colloidal gold or latex beads. Suitable immunoassays include 
enzyme-linked immunosorbent assays (ELISA). 

Alternatively, the antibody may be labeled indirectly by reaction with labeled 
substances that have an affinity for immunoglobulin. The antibody may be conjugated 
with a second substance and detected with a labeled third substance having an affinity 
for the second substance conjugated to the antibody. For example, the antibody may 
be conjugated to biotin and the antibody-biotin conjugate detected using labeled avidin 
or streptavidin. Similarly, the antibody may be conjugated to a hapten and the 
antibody-hapten conjugate detected using labeled anti-hapten antibody. These and 
other methods of labeling antibodies and assay conjugates are well known to those 
skilled in the art. 

Antibodies to the collagen-binding proteins Cpa1 or Cpa49, or portions thereof, 
may also be used in production facilities or laboratories to isolate additional quantities 
of the proteins, such as by affinity chromatography. For example, antibodies to the 
collagen-binding protein Cpa1 or Cpa49 may also be used to isolate additional 
amounts of collagen. 

The isolated proteins of the present invention, or active fragments thereof, and 
antibodies to the proteins may be useful for the treatment and diagnosis of group A 
streptococcal bacterial infections as described above, or for the development of anti- 
group A streptococcal vaccines for active or passive immunization. Further, when 
administered as a pharmaceutical composition to a wound or used to coat medical 
devices or polymeric biomaterials in vitro and in vivo, both the proteins and the 
antibodies are useful as blocking agents to prevent or inhibit the binding of group A 
streptococci to the wound site or the biomaterials themselves. Preferably, the antibody 
is modified so that it is less immunogenic in the patient to whom it is administered. For 
example, if the patient is a human, the antibody may be "humanized" by transplanting 
the complementarity determining regions of the hybridoma-derived antibody into a 
human monoclonal antibody as described, e.g., by Jones et al., Nature 321:522-525 
(1986) or Tempest et al., Biotechnology 9:266-273 (1991). 

Medical devices or polymeric biomaterials to be coated with the antibodies, 
proteins and active fragments described herein include, but are not limited to, staples, 
sutures, replacement heart valves, cardiac assist devices, hard and soft contact lenses, 
intraocular lens implants (anterior chamber or posterior chamber), other implants such 
as corneal inlays, kerato-prostheses, vascular stents, epikeratophakia devices, 
glaucoma shunts, retinal staples, scleral buckles, dental prostheses, thyroplastic 
devices, laryngoplastic devices, vascular grafts, soft and hard tissue prostheses 
including, but not limited to, pumps, electrical devices including stimulators and 
recorders, auditory prostheses, pacemakers, artificial larynx, dental implants, mammary 
implants, penile implants, cranio/facial tendons, artificial joints, tendons, ligaments, 
menisci, and disks, artificial bones, artificial organs including artificial pancreas, 
artificial hearts, artificial limbs, and heart valves; stents, wires, guide wires, intravenous 
and central venous catheters, laser and balloon angioplasty devices, vascular and 
heart devices (tubes, catheters, balloons), ventricular assists, blood dialysis 
components, blood oxygenators, urethral/ureteral/urinary devices (Foley catheters, 
stents, tubes and balloons), airway catheters (endotracheal and tracheostomy tubes 
and cuffs), enteral feeding tubes (including nasogastric, intragastric and jejunal tubes), 
wound drainage tubes, tubes used to drain the body cavities such as the pleural, 
peritoneal, cranial, and pericardial cavities, blood bags, test tubes, blood collection 
tubes, vacutainers, syringes, needles, pipettes, pipette tips, and blood tubing. 

It will be understood by those skilled in the art that the term "coated" or "coating", 
as used herein, means to apply the protein, antibody, or active fragment to a surface of 
the device, preferably an outer surface that would be exposed to streptococcal bacterial 
infection. The surface of the device need not be entirely covered by the protein, 
antibody or active fragment. 

In addition, immunological compositions, including vaccines, and other 
pharmaceutical compositions containing the Cpa1 or Cpa49 proteins or portions 
thereof are included within the scope of the present invention. Either one or both of 
the Cpa1 or Cpa49 proteins, or active or 
antigenic fragments thereof, or fusion proteins thereof, can be formulated and 
packaged, alone or in combination with other antigens, using methods and materials 
known to those skilled in the art for vaccines. The immunological response may be 
used therapeutically or prophylactically and may provide antibody immunity or cellular 
immunity, such as that produced by T lymphocytes. 

The immunological compositions, such as vaccines, and other pharmaceutical 
compositions can be used alone or in combination with other blocking agents to protect 
against human and animal infections caused by or exacerbated by group A 
streptococci. In particular, the compositions can be used to protect humans against 
skin infections such as impetigo and eczema, as well as mucous membrane infections 
such as tonsillopharyngitis. In addition, effective amounts of the compositions of the 
present invention may be used to protect against complications caused by localized 
infections such as sinusitis, mastoiditis, parapharyngeal abscesses, cellulitis, necrotizing 
fasciitis, myositis, streptococcal toxic shock syndrome, pneumonitis, endocarditis, 
meningitis, osteomyelitis, and many other severe diseases. Further, the present 
compositions can be used to protect against nonsuppurative conditions such as acute 
rheumatic fever, acute glomerulonephritis, obsessive/compulsive neurologic disorders 
and exacerbations of forms of psoriasis such as psoriasis vulgaris. The compositions 
may also be useful as appropriate in protecting both humans and other species of 
animals where needed to combat similar group A streptococcal infections. 

To enhance immunogenicity, the proteins may be conjugated to a carrier 
molecule. Suitable immunogenic carriers include proteins, polypeptides or peptides 
such as albumin, hemocyanin, thyroglobulin and derivatives thereof, particularly bovine 
serum albumin (BSA) and keyhole limpet hemocyanin (KLH), polysaccharides, 
carbohydrates, polymers, and solid phases. Other protein derived or non-protein 
derived substances are known to those skilled in the art. An immunogenic carrier 
typically has a molecular weight of at least 1,000 Daltons, preferably greater than 
10,000 Daltons. Carrier molecules often contain a reactive group to facilitate covalent 
conjugation to the hapten. The carboxylic acid group or amine group of amino acids or 
the sugar groups of glycoproteins are often used in this manner. Carriers lacking such 
groups can often be reacted with an appropriate chemical to produce them. Preferably, 
an immune response is produced when the immunogen is injected into animals such as 
mice, rabbits, rats, goats, sheep, guinea pigs, chickens, and other animals, most 
preferably mice and rabbits. Alternatively, a multiple antigenic peptide comprising 
multiple copies of the protein or polypeptide, or an antigenically or immunologically 
equivalent polypeptide may be sufficiently antigenic to improve immunogenicity without 
the use of a carrier. 

The Cpa1 or Cpa49 proteins or portions thereof, or combination of proteins, may 
be administered with an adjuvant in an amount effective to enhance the immunogenic 
response against the conjugate. At this time, the only adjuvant widely used in humans 
has been alum (aluminum phosphate or aluminum hydroxide). Saponin and its purified 
component Quil A, Freund's complete adjuvant and other adjuvants used in research 
and veterinary applications have toxicities which limit their potential use in human 
vaccines. However, chemically defined preparations such as muramyl dipeptide, 
monophosphoryl lipid A, phospholipid conjugates such as those described by 
Goodman-Snitkoff et al., J. Immunol. 147:410-415 (1991) and incorporated by reference 
herein, encapsulation of the conjugate within a proteoliposome as described by Miller 
et al., J. Exp. Med. 176:1739-1744 (1992) and incorporated by reference herein, and 
encapsulation of the protein in lipid vesicles such as Novasome™ lipid vesicles (Micro 
Vesicular Systems, Inc., Nashua, NH) may also be useful. 

The term "vaccine" as used herein includes DNA vaccines in which the nucleic 
acid molecule encoding a collagen-binding GAS protein, such as the nucleic acid 
sequences disclosed herein as SEQ ID NOS. 1 or 3, as used in a pharmaceutical 
composition is administered to a patient. For genetic immunization, suitable delivery 
methods known to those skilled in the art include direct injection of plasmid DNA into 
muscles (Wolff et al., Hum. Mol. Genet. 1:363, 1992), delivery of DNA complexed with 
specific protein carriers (Wu et al., J. Biol. Chem. 264:16985, 1989), coprecipitation of 
DNA with calcium phosphate (Benvenisty and Reshef, Proc. Natl. Acad. Sci. 83:9551, 
1986), encapsulation of DNA in liposomes (Kaneda et al., Science 243:375, 1989), 
particle bombardment (Tang et al., Nature 356:152, 1992 and Eisenbraun et al., DNA 
Cell Biol. 12:791, 1993), and in vivo infection using cloned retroviral vectors (Seeger et 
al., Proc. Natl. Acad. Sci. 81:5849, 1984). 

In another embodiment, the invention is a polynucleotide which comprises 
contiguous nucleic acid sequences capable of being expressed to produce a gene 
product upon introduction of said polynucleotide into eukaryotic tissues in vivo. The 
encoded gene product preferably either acts as an immunostimulant or as an antigen 
capable of generating an immune response. Thus, the nucleic acid sequences in this 
embodiment encode an immunogenic epitope, and optionally a cytokine or a T-cell 
costimulatory element, such as a member of the B7 family of proteins. 

There are several advantages of immunization with a gene rather than its gene 
product. The first is the relative simplicity with which native or nearly native antigen can 
be presented to the immune system. Mammalian proteins expressed recombinantly in 
bacteria, yeast, or even mammalian cells often require extensive treatment to ensure 
appropriate antigenicity. A second advantage of DNA immunization is the potential for 
the immunogen to enter the MHC class I pathway and evoke a cytotoxic T cell 
response. Immunization of mice with DNA encoding the influenza A nucleoprotein (NP) 
elicited a CD8+ response to NP that protected mice against challenge with heterologous 
strains of flu. (See Montgomery, D. L. et al., Cell Mol Biol, 43(3):285-92, 1997 and 
Ulmer, J. et al., Vaccine, 15(8):792-794, 1997.) 

Cell-mediated immunity is important in controlling infection. Since DNA 
immunization can evoke both humoral and cell-mediated immune responses, its 
greatest advantage may be that it provides a relatively simple method to survey a large 
number of S. pyogenes genes for their vaccine potential. 

Pharmaceutical compositions containing the Cpa1 or Cpa49 proteins or portions 
thereof, nucleic acid molecules, antibodies, or fragments thereof, may be formulated in 
combination with a pharmaceutical excipient or carrier such as saline, dextrose, water, 
glycerol, ethanol, other therapeutic compounds, and combinations thereof. The 
formulation should be appropriate for the mode of administration. The compositions 
are useful for interfering with, modulating, or inhibiting binding interactions between 
streptococcal bacteria and collagen on host cells. 

The amount of expressible DNA or transcribed RNA to be introduced into a 
vaccine recipient will have a very broad dosage range and may depend on the strength 
of the transcriptional and translational promoters used. In addition, the magnitude of 
the immune response may depend on the level of protein expression and on the 
immunogenicity of the expressed gene product. In general, effective dose ranges of 
about 1 ng to 5 mg, 100 ng to 2.5 mg, 1 μg to 750 μg, and preferably about 10 μg to 
300 μg of DNA are administered directly into muscle tissue. Subcutaneous injection, 
intradermal introduction, impression through the skin, and other modes of 
administration such as intraperitoneal, intravenous, or inhalation delivery are also 
suitable. It is also contemplated that booster vaccinations may be provided. Following 
vaccination with a polynucleotide immunogen, boosting with protein immunogens such 
as the Cpa1 or Cpa49 gene product is also contemplated. 

The polynucleotide may be "naked", that is, unassociated with any proteins, 
adjuvants or other agents which affect the recipient's immune system. In this case, it is 
desirable for the polynucleotide to be in a physiologically acceptable solution, such as, 
but not limited to, sterile saline or sterile buffered saline. Alternatively, the DNA may be 
associated with liposomes, such as lecithin liposomes or other liposomes known in the 
art, as a DNA-liposome mixture, or the DNA may be associated with an adjuvant known 
in the art to boost immune responses, such as a protein or other carrier. Agents which 
assist in the cellular uptake of DNA, such as, but not limited to, calcium ions, may also 
be used. These agents are generally referred to herein as transfection facilitating 
reagents and pharmaceutically acceptable carriers. Techniques for coating 
microprojectiles with polynucleotide are known in the art and are also useful in 
connection with this invention. For DNA intended for human use it may be useful to 
have the final DNA product in a pharmaceutically acceptable carrier or buffer solution. 
Pharmaceutically acceptable carriers or buffer solutions are known in the art and 
include those described in a variety of texts such as Remington's Pharmaceutical 
Sciences. 

It is recognized by those skilled in the art that an optimal dosing schedule for a 
DNA vaccination regimen may include as many as five to six, but preferably three to 
five, or even more preferably one to three administrations of the immunizing entity 
given at intervals of as few as two to four weeks, to as long as five to ten years, or 
occasionally at even longer intervals. 

Suitable methods of administration of any pharmaceutical composition disclosed 
in this application include, but are not limited to, topical, oral, anal, vaginal, 
intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal and intradermal 
administration. 

For topical administration, the composition is formulated in the form of an 
ointment, cream, gel, lotion, drops (such as eye drops and ear drops), or solution (such 
as mouthwash). Wound or surgical dressings, sutures and aerosols may be 
impregnated with the composition. The composition may contain conventional 
additives, such as preservatives, solvents to promote penetration, and emollients. 
Topical formulations may also contain conventional carriers such as cream or ointment 
bases, ethanol, or oleyl alcohol. 

In a preferred embodiment, a vaccine is packaged in a single dosage for 
immunization by parenteral (i.e., intramuscular, intradermal or subcutaneous) 
administration or nasopharyngeal (i.e., intranasal) administration. The vaccine is most 
preferably injected intramuscularly into the deltoid muscle. The vaccine is preferably 
combined with a pharmaceutically acceptable carrier to facilitate administration. The 
carrier is usually water or a buffered saline, with or without a preservative. The vaccine 
may be provided in solution or lyophilized for resuspension at the time of administration. 

Microencapsulation of the protein will give a controlled release. A number of 
factors contribute to the selection of a particular polymer for microencapsulation. The 
reproducibility of polymer synthesis and the microencapsulation process, the cost of the 
microencapsulation materials and process, the toxicological profile, the requirements 
for variable release kinetics and the physicochemical compatibility of the polymer and 
the antigens are all factors that must be considered. Examples of useful polymers are 
polycarbonates, polyesters, polyurethanes, polyorthoesters, polyamides, 
poly(D,L-lactide-co-glycolide) (PLGA) and other biodegradable polymers. The use of PLGA for 
the controlled release of antigen is reviewed by Eldridge et al., Current Topics in 
Microbiology and Immunology, 146:59-66 (1989). 

The preferred dose for human administration is from 0.01 mg/kg to 10 mg/kg, 
preferably approximately 1 mg/kg. Based on this range, equivalent dosages for heavier 
body weights can be determined. The dose should be adjusted to suit the individual to 
whom the composition is administered and will vary with age, weight and metabolism of 
the individual. The vaccine may additionally contain stabilizers or pharmaceutically 
acceptable preservatives, such as thimerosal (ethyl(2-mercaptobenzoate-S)mercury 
sodium salt) (Sigma Chemical Company, St. Louis, MO). 
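
The weight-based range stated above reduces to simple arithmetic, which the 
following Python sketch illustrates. The function name, the default of 1 mg/kg and 
the range check are assumptions made for illustration; the sketch does not account 
for the adjustments for age and metabolism mentioned above and is not dosing 
guidance. 

```python
# Bounds of the stated range, in mg of composition per kg of body weight.
MIN_MG_PER_KG = 0.01
MAX_MG_PER_KG = 10.0


def dose_mg(weight_kg, mg_per_kg=1.0):
    """Return the total dose in mg for a body weight, within the stated range.

    Raises ValueError if the weight is not positive or if the requested
    mg/kg value falls outside 0.01 to 10 mg/kg.
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    if not MIN_MG_PER_KG <= mg_per_kg <= MAX_MG_PER_KG:
        raise ValueError("mg_per_kg must lie between 0.01 and 10")
    return weight_kg * mg_per_kg


if __name__ == "__main__":
    # A hypothetical 70 kg individual at the approximately preferred 1 mg/kg.
    print(dose_mg(70.0), "mg")
```

For a hypothetical 70 kg individual at 1 mg/kg the sketch returns 70 mg; at the 
ends of the stated range the same individual would receive 0.7 mg to 700 mg. 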

When labeled with a detectable biomolecule or chemical, the collagen-binding 
proteins described herein are useful for purposes such as in vivo and in vitro diagnosis 
of streptococcal infections or detection of group A streptococcal bacteria. Laboratory 
research may also be facilitated through use of such protein-label conjugates. Various 
types of labels and methods of conjugating the labels to the proteins are well known to 
those skilled in the art. Several specific labels are set forth below. The labels are 
particularly useful when conjugated to a protein such as an antibody or receptor. For 
example, the protein can be conjugated to a radiolabel such as, but not restricted to, 
32P, 3H, 14C, 35S, 125I, or 131I. Detection of a label can be by methods such as scintillation 
counting, gamma ray spectrometry or autoradiography. 

Bioluminescent labels, such as derivatives of firefly luciferin, are also useful. 
The bioluminescent substance is covalently bound to the protein by conventional 
methods, and the labeled protein is detected when an enzyme, such as luciferase, 
catalyzes a reaction with ATP causing the bioluminescent molecule to emit photons of 
light. Fluorogens may also be used to label proteins. Examples of fluorogens include 
fluorescein and derivatives, phycoerythrin, allo-phycocyanin, phycocyanin, rhodamine, 
and Texas Red. The fluorogens are generally detected by a fluorescence detector. 

The protein can alternatively be labeled with a chromogen to provide an enzyme 
or affinity label. For example, the protein can be biotinylated so that it can be utilized in 
a biotin-avidin reaction, which may also be coupled to a label such as an enzyme or 
fluorogen. For example, the protein can be labeled with peroxidase, alkaline 
phosphatase or other enzymes giving a chromogenic or fluorogenic reaction upon 
addition of substrate. Additives such as 5-amino-2,3-dihydro-1,4-phthalazinedione 
(also known as Luminol) (Sigma Chemical Company, St. Louis, MO) and rate 
enhancers such as p-hydroxybiphenyl (also known as p-phenylphenol) (Sigma 
Chemical Company, St. Louis, MO) can be used to amplify enzymes such as 
horseradish peroxidase through a luminescent reaction; and luminogenic or 
fluorogenic dioxetane derivatives of enzyme substrates can also be used. Such labels 
can be detected using enzyme-linked immunoassays (ELISA) or by detecting a color 
change with the aid of a spectrophotometer. In addition, proteins may be labeled with 
colloidal gold for use in immunoelectron microscopy in accordance with methods well 
known to those skilled in the art. 

The location of a ligand in cells can be determined by labeling an antibody as 
described above and detecting the label in accordance with methods well known to 
those skilled in the art, such as immunofluorescence microscopy using procedures 
such as those described by Warren and Nelson (Mol. Cell. Biol., 7: 1326-1337, 1987). 

In addition to the therapeutic compositions and methods described above, the 
Cpa1 and Cpa49 proteins or active portions or fragments thereof, nucleic acid 
molecules or antibodies are useful for interfering with the initial physical interaction 
between a pathogen and mammalian host responsible for infection, such as the 
adhesion of bacteria to mammalian extracellular matrix proteins such as collagen on 
in-dwelling devices or to extracellular matrix proteins in wounds; to block Cpa1 or 
Cpa49 protein-mediated mammalian cell invasion; to block bacterial adhesion between 
collagen and bacterial Cpa1 or Cpa49 proteins or portions thereof that mediate tissue 
damage; and, to block the normal progression of pathogenesis in infections initiated 
other than by the implantation of in-dwelling devices or surgical techniques. 

The Cpa1 or Cpa49 proteins, or active fragments thereof, are useful in a method 
for screening compounds to identify compounds that inhibit collagen binding of 
streptococci to host molecules. In accordance with the method, the compound of 
interest is combined with one or more of the Cpa1 or Cpa49 proteins or fragments 
thereof and the degree of binding of the protein to collagen or other extracellular matrix 
proteins is measured or observed. If the presence of the compound results in the 
inhibition of protein-collagen binding, for example, then the compound may be useful 
for inhibiting group A streptococci in vivo or in vitro. The method could similarly be 
used to identify compounds that promote interactions of GAS with host molecules. The 
method is particularly useful for identifying compounds having bacteriostatic or 
bactericidal properties. 

For example, to screen for GAS agonists or antagonists, a synthetic reaction 
mixture, a cellular compartment (such as a membrane, cell envelope or cell wall) 
containing one or more of the Cpa1 or Cpa49 proteins or fragments thereof and a 
labeled substrate or ligand of the protein is incubated in the absence or the presence of 
a compound under investigation. The ability of the compound to agonize or antagonize 
the protein is shown by a decrease in the binding of the labeled ligand or decreased 
production of substrate product. Compounds that bind well and increase the rate of 
product formation from substrate are agonists. Detection of the rate or level of 
production of product from substrate may be enhanced by use of a reporter system, 
such as a colorimetric labeled substrate converted to product, a reporter gene that is 
responsive to changes in Cpa1 or Cpa49 nucleic acid or protein activity, and binding 
assays known to those skilled in the art. Competitive inhibition assays can also be 
used. 
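
By way of illustration only, the comparison of binding in the absence and the presence of a compound under investigation can be sketched as a percent-inhibition calculation. The function names, the background term and the 50% cut-off below are assumptions made for this sketch and are not taken from the disclosure.

```python
def percent_inhibition(signal_with_compound, signal_without_compound, background=0.0):
    """Percent reduction of the binding signal caused by the test compound.

    Signals are any readout proportional to bound labeled ligand
    (for example an ELISA absorbance); `background` is an assumed blank value.
    """
    control = signal_without_compound - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    treated = signal_with_compound - background
    return 100.0 * (1.0 - treated / control)


def classify_compound(signal_with, signal_without, background=0.0, threshold=50.0):
    """Label a compound using a hypothetical +/- threshold (in percent)."""
    pct = percent_inhibition(signal_with, signal_without, background)
    if pct >= threshold:
        return "candidate antagonist"   # binding clearly reduced
    if pct <= -threshold:
        return "candidate enhancer"     # binding clearly increased
    return "no clear effect"


if __name__ == "__main__":
    print(percent_inhibition(0.25, 1.0))
    print(classify_compound(0.25, 1.0))
```

In practice the cut-off would be chosen from the variability of replicate control wells rather than fixed in advance.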

Potential antagonists include small organic molecules, peptides, polypeptides 
and antibodies that bind to Cpa1 or Cpa49 nucleic acid molecules or proteins or 
portions thereof and thereby inhibit their activity or bind to a binding molecule (such as 
collagen) to prevent the binding of the Cpa1 or Cpa49 nucleic acid molecules or 
proteins to its ligand. For example, a compound that inhibits Cpa1 or Cpa49 activity 
may be a small molecule that binds to and occupies the binding site of the Cpa1 or 
Cpa49 protein, thereby preventing binding to cellular binding molecules, to prevent 
normal biological activity. Examples of small molecules include, but are not limited to, 
small organic molecules, peptides or peptide-like molecules. Other potential antagonists 
include antisense molecules. Preferred antagonists include compounds related to and 
variants or derivatives of the Cpa1 or Cpa49 proteins or portions thereof. The nucleic 
acid molecules described herein may also be used to screen compounds for 
antibacterial activity. 

The invention further contemplates a kit containing one or more Cpa1 or Cpa49- 
specific nucleic acid probes, which can be used for the detection of collagen-binding 
proteins from group A streptococci in a sample, or for the diagnosis of GAS bacterial 
infections. Such a kit can also contain the appropriate reagents for hybridizing the 
probe to the sample and detecting bound probe. In an alternative embodiment, the kit 
contains antibodies specific to either or both Cpa1 and Cpa49 proteins or active 
portions thereof which can be used for the detection of group A streptococci. 

In yet another embodiment, the kit contains either or both the Cpa1 and Cpa49 
proteins, or active fragments thereof, which can be used for the detection of GAS 
bacteria or for the presence of antibodies to collagen-binding GAS proteins in a 
sample. The kits described herein may additionally contain equipment for safely 
obtaining the sample, a vessel for containing the reagents, a timing means, a buffer for 
diluting the sample, and a colorimeter, reflectometer, or standard against which a color 
change may be measured. 

In a preferred embodiment, the reagents, including the protein or antibody, are 
lyophilized, most preferably in a single vessel. Addition of aqueous sample to the 
vessel results in solubilization of the lyophilized reagents, causing them to react. Most 
preferably, the reagents are sequentially lyophilized in a single container, in 
accordance with methods well known to those skilled in the art that minimize reaction 
by the reagents prior to addition of the sample. 

Table 1. Sequence homologies of the ORFs of the GAS nra genomic region.

GAS ORF (provisional)   Homologous protein sequence;                                   Percentage
number/designation      source organism                                                identity/similarity   Reference

1 (msmRL)               Multiple sugar metabolism regulator;                           34/59                 Russell et al. (1992)
                        Streptococcus mutans
2 (ORF2)                No homologous sequence identified
[illegible]             C-terminus of electron transfer flavoprotein 1a;               27/47                 Chen and Swenson (1994)
                        Methylophilus methylotrophus
[illegible]             Signal peptidase I;                                            46/67                 Cregg et al. (1996)
                        Staphylococcus aureus
[illegible]             Protein F;                                                     28/41                 Hanski and Caparon (1992)
                        Streptococcus pyogenes
nra                     RofA regulator of protein F;                                   63/73                 Fogg et al. (1994)
                        Streptococcus pyogenes
5 (ORF5)                Hypothetical 31.8 kDa protein in ftsH-cysK intergenic region;  35/62                 Ogasawara et al. (1994)
                        Bacillus subtilis
6 (nifR3L)              Nitrogenase regulator;                                         32/46                 Machado et al. (1995)
                        Azospirillum brasilense
7 [illegible]           Hypothetical 37.1 kDa protein;                                 59/75                 Ogasawara et al. (1994)
                        Bacillus subtilis
[illegible]             dA/dG-kinase;                                                  57/74                 Ma et al. (1995)
                        Lactobacillus acidophilus
[illegible]             Single-strand DNA-binding protein;                             50/65                 Rikke et al. (1995)
                        Bacillus subtilis
[illegible]             Phenylalanyl-tRNA synthase beta subunit;                       49/62                 Brakhage et al. (1990)
                        Bacillus subtilis

Table 2. Presence of nra/rofA(-associated) genes in selected GAS serotype strains.

                  Regulatory genes    Structural genes
Serotype strain   nra     rofA        prtF    prtF2    cpa

[Column alignment of the entries is not recoverable from the source, and the "-" entries are missing; the legible marks are listed for each strain in their original order.]

M1                + +
M2                (none legible)
M3                + + [illegible]
M4                (none legible)
M5                +
M6                +
M12               + +
M18               + +
M24               (none legible)
M49               + + +

Genes were detected with specific probes used for genomic Southern blot hybridizations as 
well as by specific PCR assays. Sequences of primers used for analytical PCRs or to generate 
probes are shown in Table 4. 

+, hybridization/PCR product detectable; -, no hybridization/PCR product detectable. 



Table 3. Human matrix protein-binding activity of a recombinant Cpa protein.

                          Collagen        Fibronectin     Laminin         BSA

Cpa/Mal fusion            0.373 ± 0.011   0.074 ± 0.008   0.115 ± 0.036   0.049 ± 0.021
Mal                       0.104 ± 0.007   0.042 ± 0.002   0.060 ± 0.006   0.033 ± 0.005
HRPO (negative control)   0.040 ± 0.013   0.036 ± 0.015   0.038 ± 0.012   0.028 ± 0.006

The binding activity of a purified Cpa-maltose binding protein fusion and the maltose-binding 
protein alone (Mal), both coupled to horseradish peroxidase (HRPO), were compared with that 
of HRPO alone. The assay was performed in an ELISA format as described in Experimental 
procedures. The results were read as OD492 values. The data were analysed by the Wilcoxon 
range test, and the binding of the Cpa-Mal fusion to collagen type I and to laminin was found to 
be statistically significant (P<0.05). 

Table 4. List of oligonucleotides used in this work.

Designation      Sequence (5' to 3')           Position numbers   Reference

(A)
nra FOR          ATTTTTTCTCATGTTGCTA           6474-6492          This study
nra REV          GTTTAGAATGGTTTAATTG           7308-7290          This study
rofA FOR         GCCAATAACTGAGGTAGC            141-158            Fogg et al. (1994)
rofA REV         GGCTTTTGCTCTTTTAGGT           995-977            Fogg et al. (1994)
cpa FOR          AGTTCACAAGTTGTCTACTG          3435-3454          This study
cpa REV          AAATAATAGATAGCAAGCTG          3727-3708          This study
prtF FOR         ATTAATGCCAGAGTTAGATG          1414-1433          Hanski and Caparon (1992)
prtF REV         CGATTCTCTTCCACTTTG            2259-2242          Hanski and Caparon (1992)
prtF2 FOR        TACTCTGTTAAAGAAGTAACTG        2260-2281          Jaffe et al. (1996)
prtF2 REV        CTCAGAGTCAGTTTCTGG            3166-3151          Jaffe et al. (1996)
nifR3 FOR        GGATTTTGCCTACTACTTA           8443-8461          This study
nifR3 REV        GTGGAATATCTAAAACAGAC          9313-9294          This study

(B)
nra-ins FOR      TTTTATTGGAGACTAGAAGTTTA       6325-6347          This study
nra-ins REV      AGCAAGCCACTGATTTAC            7461-7444          This study
cpa-ins FOR      TGCAAAAGAGGGATAAAAC           5932-5914          This study
cpa-ins REV      GAAGCAGTAGACAACTTGTG          4707-4726          This study
nraLuc FOR1      TAAACTAAAGTAGCTTAGCA          5953-5972          This study
nraLuc FOR5      ATGGAACGTCATCACAAC            6688-6705          This study
nraLuc REV1      CAGATACCTAAAAATAAACG          7930-7911          This study
cpa-pMAL FOR     GCTGAAGAACAATCAGTACCA         5798-5778          This study
cpa-pMAL REV     TTAGTCATTTTTTAACCCTTTACG      3705-3728          This study

(C)
RT-nra FOR       CTTTTTACTTATTAAGAGATGA        7669-7690          This study
RT-nra REV       CTCGTTTAGAAAATCTTG            7886-7869          This study
RT-orf5 FOR      AAAATAATTAAATCAATAGCA         8030-8050          This study
RT-orf5 REV      CCACAGAGATAATGTGT             8258-8241          This study

Oligonucleotides were used as primers to PCR amplify probes for Southern and Northern blot hybridizations (A), genomic fragments for cloning 
into pFW11, pFW11-luc or pMAL-c2 plasmids (B) and primers for RT-PCR to detect nra- and orf5-specific transcripts (C). 
Primer pairs nra-ins FOR/REV, cpa-ins FOR/REV, nraLuc FOR/REV and cpa-pMAL FOR/REV were 5' extended with SphI/SpeI, NheI/BamHI 
and BamHI/PstI sites, respectively, to facilitate forced cloning of the resulting PCR products. The nucleotide position numbers refer to the GAS nra 
genomic sequence as submitted to GenBank or the cited publications. 
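
As a minimal sketch of how the entries of Table 4 can be cross-checked, the length of each oligonucleotide can be compared with the span implied by its position numbers, and the primer orientation read from whether the numbers ascend or descend. The three primers used are copied from Table 4; the helper names are assumptions introduced for this illustration.

```python
def span_length(start, end):
    """Number of nucleotides covered by an inclusive position range."""
    return abs(end - start) + 1


def orientation(start, end):
    """Ascending numbers indicate a forward primer, descending a reverse primer."""
    return "forward" if start <= end else "reverse"


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


# (sequence, first position, last position) as listed in Table 4
PRIMERS = {
    "nra FOR": ("ATTTTTTCTCATGTTGCTA", 6474, 6492),
    "nra REV": ("GTTTAGAATGGTTTAATTG", 7308, 7290),
    "cpa FOR": ("AGTTCACAAGTTGTCTACTG", 3435, 3454),
}


def check_primers(primers):
    """Return {name: (length matches span, orientation)} for each primer."""
    report = {}
    for name, (seq, start, end) in primers.items():
        report[name] = (len(seq) == span_length(start, end), orientation(start, end))
    return report


if __name__ == "__main__":
    for name, result in check_primers(PRIMERS).items():
        print(name, result)
```

A mismatch between sequence length and span would point to a transcription error in either the sequence or the position numbers.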

EXAMPLES 

The following examples are provided which exemplify aspects of the preferred 
embodiments of the present invention. It should be appreciated by those of skill in the 
art that the techniques disclosed in the examples which follow represent techniques 
discovered by the inventors to function well in the practice of the invention, and thus 
can be considered to constitute preferred modes for its practice. However, those of 
skill in the art should, in light of the present disclosure, appreciate that many changes 
can be made in the specific embodiments which are disclosed and still obtain a like or 
similar result without departing from the spirit and scope of the invention. 

EXAMPLE 1: ISOLATION OF GROUP A STREPTOCOCCAL PROTEINS 

A. Bacterial strains and culture conditions 

GAS serotype M49 strain CS101 was provided by P. Cleary, MN, USA. 
Serotypes M1, M2, M3, M4, M5, M6, M12, M18 and M24 GAS strains T1/195/2, 
T2/44/RB4.119, B93Q/60/2, 75-194, T5B/126/3, S43/192/1, T12/126/4, J17C/55/1 and 
71-694 were obtained from D. Johnson, MN, USA. The M49 GAS isolates B737/137/1, 
49-49/123, 88-299, 90-063, 90-397, 89-288, 90-306 and 8314/1946 have been 
described by Kaufhold et al. (1992). E. coli strain Blue MRF served as a host for phage 
Lambda ZAP Express. E. coli strain DH5α was used as host for plasmids pFW11 and 
pMAL-c2. 

E. coli DH5α isolates transformed with pFW11 or pMAL-c2 derivatives were 
grown on disk sensitivity testing agar (Unipath) supplemented with 100 mg l⁻¹ 
spectinomycin or 50 mg l⁻¹ ampicillin, respectively. E. coli Blue MRF strains infected 
with recombinant lambda phages were grown in NZ casamino acids/yeast extract (NZY) 
agar according to the instructions of the supplier (Stratagene). All E. coli cultures were 
grown at 37°C in ambient air. 

GAS strains were cultured in TH broth and on TH agar (Unipath) both 
supplemented with 0.5% yeast extract (THY), or in chemically defined medium (CDM) 
(van de Rijn and Kessler, 1980). The GAS mutant strains were maintained in medium 
containing 60 mg l⁻¹ spectinomycin. Culture conditions for GAS strains were a 
temperature of 37°C and a 5% CO2/20% O2 atmosphere unless specifically described. 

B. Vectors 

E. coli phage Lambda ZAP Express (BamHI arms, CIP treated) was purchased 
from Stratagene and used according to the instructions of the manufacturer. 

Plasmid pFW11 was used for insertional mutagenesis as described by 
Podbielski et al. (1996c). Plasmid pFW11 multiple cloning site (MCS) 1. The 
luciferase (luc) box was amplified by PCR using plasmid pUSL2/5 (Gräfe et al., 1996) 
as template and oligonucleotides lucFor 
(5'-GACGATCTCGAGGAGGTAAATGAAGACGCCAAAAAC-3') and lucRev 
(5'-GACGATAAGCTTTTACAATTTGGACTTTCCG-3') as primers. The luciferase box 
contained an optimized Shine-Dalgarno sequence as well as the luc start and stop 
codons. Cloning of GAS genomic fragments into MCS1 of pFW11-luc followed the 
protocol outlined by Podbielski et al. (1996c). 
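
The lucFor and lucRev oligonucleotides given above can be inspected for restriction enzyme recognition sequences in their 5' extensions. The following is a minimal sketch: the recognition sequences of XhoI (CTCGAG) and HindIII (AAGCTT) are standard, but whether these particular sites were the ones used for cloning the luciferase box is an assumption of this illustration and is not stated in the text.

```python
# Primer sequences as given in the text (5' to 3')
LUC_FOR = "GACGATCTCGAGGAGGTAAATGAAGACGCCAAAAAC"
LUC_REV = "GACGATAAGCTTTTACAATTTGGACTTTCCG"

# Recognition sequences of two common restriction enzymes
SITES = {"XhoI": "CTCGAG", "HindIII": "AAGCTT"}


def find_sites(primer, sites=SITES):
    """Return {enzyme: 0-based position} for each recognition sequence found."""
    primer = primer.upper()
    hits = {}
    for enzyme, motif in sites.items():
        position = primer.find(motif)
        if position != -1:
            hits[enzyme] = position
    return hits


if __name__ == "__main__":
    print("lucFor:", find_sites(LUC_FOR))
    print("lucRev:", find_sites(LUC_REV))
```

Each primer carries one recognition sequence directly after a six-nucleotide 5' clamp, which is the usual layout for primers intended for directional cloning.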

Plasmid pMAL-c2 was used for expression of the cpa gene and was purchased 
from New England Biolabs. It was used according to the instructions of the 
manufacturer. 

C. DNA techniques 

Chromosomal GAS DNA was prepared by the method of Martin et al. (1990). 
Plasmid DNA preparations and genetic manipulations as well as other conventional 
DNA techniques were performed as described by Ausubel et al. (1990). 
Transformation of GAS strains by electroporation was according to the protocol of 
Caparon and Scott (1991). 

Usage of the serotype M49 GAS Lambda library for sequencing of recombinant 
GAS genomic DNA followed the protocol of Podbielski et al. (1996b). Oligonucleotides 
used for sequencing and PCR were designed with the aid of OLIGO 5.0 (National 
Biosciences), synthesized on an OLIGO 1000 DNA synthesizer (Beckman) and 
desalted through NAP5 columns (Pharmacia). The parameters of PCR assays, direct 
labeling of PCR products with DIG-dUTP, analysis of PCR products and parameters for 
direct sequencing of PCR products were as described previously (Podbielski et al., 
1995). 

DNA sequences were compiled and analyzed with PC GENE 6.8 
(IntelliGenetics). Sequence comparisons were performed using the BLAST programs 
and the databases of the GenBank data library as well as the Streptococcal Genome 
Sequencing Project of the University of Oklahoma, USA (Roe et al., 1997). 

D. RNA preparation and analysis 

For RNA preparations, serotype M49 GAS strains were grown aerobically to 
OD600 values of 0.2, 0.5, and 0.9, which corresponded to early, medium and late 
logarithmic growth phases respectively. Before preparation, cells were sedimented by 
2 min centrifugation at 4°C and suspended in ice-cold 20 mM Tris (pH 7.5)/6 mM 
MgCl2/20 mM sodium azide/400 mg l⁻¹ chloramphenicol. RNA preparation followed the 
protocol of Shaw and Clewell (1985). Denaturing agarose gel electrophoresis and 
Northern blot hybridizations with DIG-dUTP-labeled probes were performed as 
described previously (Podbielski et al., 1995). Probes were generated by asymmetric 
PCR, using only 10⁻² to 10⁻³ of the normal amounts of the appropriate upstream 
primers. 

RT-PCR was performed with RNA after 30 min exposure to DNase I according 
to the manufacturer's protocol (Boehringer Mannheim). Reverse transcription using 
Superscript II RT-polymerase was done as described by the manufacturer (Gibco BRL) 
using the appropriate downstream primers (Table 4). One microlitre of the RT assay 
was used as template for PCR employing the PCR primers listed in Table 4. Controls 
included primer control with genomic DNA template, reagent contamination control by 
running both reactions without RNA template, and DNA contamination control by 
running both reactions without RT-polymerase. 
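
The logic of the three RT-PCR controls can be summarized in a short sketch. The function name, argument names and result strings are assumptions made for illustration; only the meaning of each control is taken from the paragraph above.

```python
def interpret_rt_pcr(sample_band, genomic_dna_band, no_rna_band, no_rt_band):
    """Interpret an RT-PCR result from the presence (True) or absence (False) of bands.

    sample_band      -- product from the RNA sample after reverse transcription
    genomic_dna_band -- primer control with genomic DNA as template
    no_rna_band      -- both reactions run without RNA template
    no_rt_band       -- both reactions run without RT-polymerase
    """
    if not genomic_dna_band:
        return "invalid: primers gave no product on genomic DNA"
    if no_rna_band:
        return "invalid: reagent contamination"
    if no_rt_band:
        return "invalid: DNA contamination of the RNA preparation"
    return "transcript detected" if sample_band else "transcript not detected"


if __name__ == "__main__":
    print(interpret_rt_pcr(True, True, False, False))
    print(interpret_rt_pcr(True, True, False, True))
```

A sample band is only read as evidence of a transcript when the primer control is positive and both contamination controls are negative.
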

E. DNA mutagenesis experiments 

Insertional inactivation of the nra gene was performed using a recombinant 
pFW11 plasmid following the strategy and specific methods according to Podbielski et 
al. (1996c). The primers nra-insFOR/REV and cpa-insFOR/REV annealing to nra and 
cpa internal sequences were used to generate PCR products, which were cloned into 
pFW11 via the SphI/SpeI or NheI/BamHI sites of MCS1. Specific integration of the nra 
recombinant plasmid into the GAS genome was confirmed by Southern blot 
hybridization using BamHI-, SpeI- and XbaI-digested genomic DNA and probes specific 
for the integrated antibiotic resistance marker aad9 as well as for the duplicated nra 
sequence. 

Construction of the nra promoter-luciferase fusions was performed using plasmid 
pFW11-luc (this study) and PCR products comprising the 3' end of the nra gene or the 
entire nra promoter and structural gene region. For amplification of the PCR product, 
primers nraLucFOR5 or nraLucFOR1, and nraLucREV1 were used (Table 4). The 
primers annealed in the central region of the nra gene or immediately upstream of the 
cpa gene (Fig. 1) and at the stop codon of the nra gene. Using NheI and BamHI sites 
as 5' tags for the upstream and downstream primers, respectively, the resulting PCR 
products were cloned into the corresponding MCS1 site of pFW11-luc. Specific 
integration of the plasmid in the GAS genome was confirmed as shown. 

F. Measuring adherence to immobilized human matrix proteins 

Cells grown on solid medium were prepared by spreading a 10 µl aliquot of 
overnight cultures onto fresh THY agar plates and incubating the plates overnight in 
ambient air, 5% CO2 or anaerobic incubators. Plates were then flooded with 3 ml of 
DPBS, pH 7.4 (PBS plus 0.88 mM CaCl2/0.45 mM MgCl2) and incubated for 10 min at 
room temperature. Cells were suspended gently using a glass spreader, removed from 
the plate with a pipette avoiding the production of air bubbles and transferred into a test 
tube. Cells were then suspended by gentle, repeated pipetting. 

Labeling of bacteria and adhesion assays followed a protocol of Geelen et al., 
(1993). Specifically, for labeling of bacteria, thoroughly suspended cells were washed 
in 12 ml of DPBS and suspended in 2 ml of FITC solution (1 mg ml⁻¹ FITC in 50 mM 
sodium carbonate buffer, pH 9.2, stored in the dark and passed through a 0.2 µm pore 
size filter before use). After 20 min incubation at room temperature in the dark, cells 
were sedimented by centrifugation, washed in DPBS, suspended in 2 ml of DPBS and 
sonicated for 20 s at setting 4 in the refrigerated hollow horn of the sonifier 450 
(Branson Ultrasonic). The OD600 of the suspension was adjusted to 1.0 with 
DPBS, and the suspension was sonicated again to disrupt aggregates and kept in the dark until used. 

For immobilization of human matrix proteins, Terasaki microtitre plates were 
washed once with DPBS, pH 7.4. Then, 10 µl of 100 µg ml⁻¹ human fibronectin or 
collagen type I (Gibco BRL) was added to the wells and incubated overnight at room 
temperature in a moist chamber. 

The preincubated Terasaki microtitre plates were washed with DPBS, and 
residual buffer was carefully removed. Then, 10 µl aliquots of FITC-labeled cell 
suspensions were added to the wells and incubated for 60 min at 37°C in a 5% 
CO2/20% O2 atmosphere. The plates were then washed five times with DPBS, and 
bound cells were fixed by flooding plates with 0.5% glutaraldehyde for 5 min. The 
plates were again washed twice with DPBS and kept in the dark until measured. The 
intensity of FITC labeling was controlled for each assay by measuring the fluorescence 
intensity of 10 µl aliquots of cells added in triplicate to uncoated DPBS-washed 
Terasaki microtitre plates and directly counted. 

Fluorescence of single wells was evaluated by processing the plates through an 
automated Cyto Fluor II fluorescence reader (PerSeptive Biosystems) operating with 
excitation and detection wavelengths of 485 nm and 530 nm respectively. Sensitivity 
gain levels of 72 or 62 were used for binding assays and FITC-labeling control 
respectively. 

For each assay, adherence to a human protein was measured for at least two 
coated plates and four replicate wells each located at different positions on the plates. 
For both matrix proteins, the assays were repeated at least four times on different days. 
To normalize the data, the following calculations were carried out. 

The four replicates on a given plate were averaged to give a single value ("ave-RLU"). 
The ave-RLU values from the nra mutants on each plate were corrected for 
differences in FITC labeling intensity as follows: 

ave-RLU × [(wild-type strain intensity of labeling)/(mutant strain intensity of labeling)]. 

The maximum difference for intensity of labeling was less than a factor of 2. 
Standardization across experiments was accomplished by multiplying all values by a 
standardization factor. This standardization factor was derived by comparing all 
subsequent experiments to the first experiment using the following scheme: 

(wild-type strain intensity of labeling in assay no. 1)/(wild-type strain intensity of 
labeling in assay no. Y). 

Once calculated, all values derived in experiment Y were multiplied by the 
standardization factor. 
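
The normalization described above can be written out as a short sketch. The two formulas are those given in the text; the function names, the dictionary layout and the strain labels ("wt", "nra") are assumptions introduced only for this illustration.

```python
from statistics import mean


def ave_rlu(replicate_wells):
    """Average the replicate wells of one plate into a single ave-RLU value."""
    return mean(replicate_wells)


def correct_for_labeling(mutant_ave_rlu, wt_label_intensity, mutant_label_intensity):
    """ave-RLU x (wild-type labeling intensity / mutant labeling intensity)."""
    return mutant_ave_rlu * (wt_label_intensity / mutant_label_intensity)


def standardization_factor(wt_label_assay_1, wt_label_assay_y):
    """(wild-type labeling in assay no. 1) / (wild-type labeling in assay no. Y)."""
    return wt_label_assay_1 / wt_label_assay_y


def normalize_assay(wells_by_strain, label_intensity, wt_label_assay_1, wild_type="wt"):
    """Normalize all strains of assay Y to the scale of assay no. 1."""
    factor = standardization_factor(wt_label_assay_1, label_intensity[wild_type])
    normalized = {}
    for strain, wells in wells_by_strain.items():
        value = ave_rlu(wells)
        if strain != wild_type:
            # only mutant values are corrected for labeling differences
            value = correct_for_labeling(
                value, label_intensity[wild_type], label_intensity[strain]
            )
        normalized[strain] = value * factor
    return normalized


if __name__ == "__main__":
    wells = {"wt": [100, 100, 100, 100], "nra": [40, 60, 40, 60]}
    labels = {"wt": 200.0, "nra": 100.0}
    print(normalize_assay(wells, labels, wt_label_assay_1=400.0))
```

The input numbers in the usage lines are invented for demonstration and are not measurements from the experiments described.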

For comparison, unlabeled bacteria were tested for adherence to collagen type I 
and detected by a rabbit polyclonal anti-group A carbohydrate antiserum as 
described by Gubbe (1997). 

EXAMPLE 2: Expression of a recombinant CPA protein and determination of its 
matrix protein-binding properties 

The entire cpa gene except for its leader peptide encoding portion was amplified 
by PCR using the primers cpa-pMAL FOR and cpa-pMAL REV (Table 4). The resulting 
product was cloned into the BamHI and PstI sites of plasmid pMAL-c2. Expression in 
the presence of 2 mM IPTG with an induction period of 4 h and subsequent 
non-denaturing preparation followed a protocol of Ausubel et al. (1990). Purification of the 
recombinant CPA-maltose binding fusion protein using a composite amylose/agarose 
matrix was performed according to the instructions of the manufacturer (New England 
Biolabs). The purified fusion protein was then labeled with peroxidase as described by 
Schmidt et al. (1993). 

Microtitre plates (96-well, flat-bottom; Nunc) were coated with BSA and human 
fibronectin, type I collagen or laminin (Gibco BRL) by adding 2 µg of each protein 
dissolved in 200 µl of 50 mM sodium carbonate, pH 8.6, to single wells. The wells were 
washed with PBS, pH 7.8, plus 0.5% Tween 20 and blocked with 0.01% Tween 20 
(PBS-T). 

Peroxidase-labeled Cpa-maltose binding protein fusion and recombinant purified 
maltose-binding protein (for control of specific binding of the bacteria) were added to 
the wells for 2 h at room temperature. Non-conjugated peroxidase at a 1:300 dilution in 
PBS-T was used as a negative control. After washing with PBS-T, all wells were 
incubated with ortho-phenylenediamine (Sigma) and measured in an ELISA reader 
(SLT Rainbow) set at 492 nm detection wavelength as outlined by Tijssen (1985). All 
assays were repeated on at least three independent occasions. 

SEQUENCE LISTING 

<110> PODBIELSKI, ANDREAS 

<120> COLLAGEN-BINDING PROTEINS FROM STREPTOCOCCUS PYOGENES 
<130> P06628US0/BAS 

<140> 00/000,000 
<141> 2000-01-31 

<160> 4 

<170> PatentIn Ver. 2.0 

<210> 1 
<211> 2274 
<212> DNA 

<213> Streptococcus pyogenes 
<400> 1 

atgaaaaaaa caaggtttcc aaataagctt aatactctta atactcaaag ggtattaagt 60 
aaaaactcaa aacgatttac tgtcacttta gtgggagtct ttttaatgat cttcgctttg 120 
gtaacttcca tggttggtgc taagactgtt tttggtttag tagaatcctc gacgccaaac 180 
gcaataaatc cagattcaag ttcggaatac agatggtatg gatatgaatc ttatgtaaga 240 
gggcatccat attataaaca gtttagagta gcacacgatt taagggttaa cttagaagga 300 
agtagaagtt atcaagttta ttgctttaat ttaaagaaag catttcctct cggatcagat 360 
agtagtgtta aaaagtggta taaaaaacat gatggaatct ctacaaaatt tgaagattat 420 
gcgatgagcc ctagaattac gggagatgag ctaaatcaga agttacgagc tgttatgtat 480 
aatggacatc cacaaaatgc caatggtatt atggaaggct tggaaccctt gaatgctatc 540 



agagttacac aagaggcggt atggtactat tctgataatg ctcctatttc taatccagat 600 
gaaagtttta aaagggagtc agaaagtaac ttggttagta cttctcaatt atctttgatg 660 
cgtcaagctt tgaagcaact gattgatccg aatttggcaa ctaaaatgcc aaaacaagtt 720 
ccggatgatt ttcagctaag tatttttgag tctgaggaca agggagataa atataataaa 780 
ggataccaaa atcttttgag tggtggttta gttcctacta aaccaccaac tccaggagac 840 
ccaccaatgc ctccaaatca acctcaaacg acttcagtac ttattagaaa gtatgctata 900 
ggtgattact ctaaattgct tgaaggtgca acattacagt tgacagggga taacgtgaat 960 
agttttcaag cgagagtgtt tagcagtaat gatattggag aaagaattga actatcagat 1020 
ggaacttata ctttaactga attgaattct ccagctggtt atagtatcgc agagccaatc 1080 
acttttaagg ttgaagctgg caaagtgtat actattattg atggaaaaca gattgaaaat 1140 
cccaataaag agatagtaga gccttactca gtagaagcat ataatgattt tgaagaattt 1200 
agcgttttaa ctacacaaaa ctatgcaaaa ttttattatg caaaaaataa aaatggaagt 1260 
tcacaggttg tctattgctt taatgcagat ctaaaatctc caccagactc tgaagatggt 1320 
gggaaaacaa tgactccaga ctttacaaca ggagaagtaa aatacactca tattgcaggt 1380 
cgtgacctct ttaaatatac tgtgaaacca agagataccg atcctgacac tttcttaaaa 1440 
catatcaaaa aagtaattga gaagggttac agggaaaaag gacaagctat tgagtatagt 1500 
ggtctaactg agacacaatt gcgtgcggct actcagttag caatatatta tttcactgat 1560 
agtgctgaat tagataagga taaactaaaa gactatcatg gttttggaga catgaatgat 1620 
agtactttag cagttgctaa aatccttgta gaatacgctc aagatagtaa tcctccacag 1680 
ctaactgacc ttgatttctt tattccgaat aacaataaat atcaatctct tattggaact 1740 
cagtggcatc cagaagattt agttgatatt attcgtatgg aagataaaaa agaagttata 1800 
cctgtaactc ataatttaac attgagaaaa acggtgactg gtttagctgg tgacagaact 1860 
aaagatttcc attttgaaat tgaattaaaa aataataagc aagaattgct ttctcaaact 1920 
gttaaaacag ataaaacaaa cctcgaattt aaagatggta aagcaaccat taatttaaaa 1980 
catggggaaa gtttaacact tcaaggttta ccagaaggtt attcttacct tgtcaaagaa 2040 
acagattctg aaggctataa ggttaaagtt aatagccaag aagtagcaaa tgctacagtt 2100 
tcaaaaacag gaataacaag tgatgagaca cttgcttttg aaaataataa agagcctgtt 2160 
gttcctacag gagttgatca aaagatcaat ggctatctag ctttgatagt tatcgctggt 2220 
atcagtttgg ggatctgggg aattcacacg ataaggataa gaaaacatga ctag 2274 



<210> 2 
<211> 757 
<212> PRT 

<213> Streptococcus pyogenes 
<400> 2 

Met Lys Lys Thr Arg Phe Pro Asn Lys Leu Asn Thr Leu Asn Thr Gln 
1 5 10 15 

Arg Val Leu Ser Lys Asn Ser Lys Arg Phe Thr Val Thr Leu Val Gly 
20 25 30 

Val Phe Leu Met Ile Phe Ala Leu Val Thr Ser Met Val Gly Ala Lys 
35 40 45 

Thr Val Phe Gly Leu Val Glu Ser Ser Thr Pro Asn Ala Ile Asn Pro 
50 55 60 

Asp Ser Ser Ser Glu Tyr Arg Trp Tyr Gly Tyr Glu Ser Tyr Val Arg 
65 70 75 80 

Gly His Pro Tyr Tyr Lys Gln Phe Arg Val Ala His Asp Leu Arg Val 
85 90 95 

Asn Leu Glu Gly Ser Arg Ser Tyr Gln Val Tyr Cys Phe Asn Leu Lys 
100 105 110 

Lys Ala Phe Pro Leu Gly Ser Asp Ser Ser Val Lys Lys Trp Tyr Lys 
115 120 125 

Lys His Asp Gly Ile Ser Thr Lys Phe Glu Asp Tyr Ala Met Ser Pro 
130 135 140 

Arg Ile Thr Gly Asp Glu Leu Asn Gln Lys Leu Arg Ala Val Met Tyr 
145 150 155 160 

Asn Gly His Pro Gln Asn Ala Asn Gly Ile Met Glu Gly Leu Glu Pro 
165 170 175 

Leu Asn Ala Ile Arg Val Thr Gln Glu Ala Val Trp Tyr Tyr Ser Asp 
180 185 190 

Asn Ala Pro Ile Ser Asn Pro Asp Glu Ser Phe Lys Arg Glu Ser Glu 
195 200 205 

Ser Asn Leu Val Ser Thr Ser Gln Leu Ser Leu Met Arg Gln Ala Leu 
210 215 220 

Lys Gln Leu Ile Asp Pro Asn Leu Ala Thr Lys Met Pro Lys Gln Val 
225 230 235 240 

Pro Asp Asp Phe Gln Leu Ser Ile Phe Glu Ser Glu Asp Lys Gly Asp 
245 250 255 

Lys Tyr Asn Lys Gly Tyr Gln Asn Leu Leu Ser Gly Gly Leu Val Pro 
260 265 270 

Thr Lys Pro Pro Thr Pro Gly Asp Pro Pro Met Pro Pro Asn Gln Pro 
275 280 285 

Gln Thr Thr Ser Val Leu Ile Arg Lys Tyr Ala Ile Gly Asp Tyr Ser 
290 295 300 

Lys Leu Leu Glu Gly Ala Thr Leu Gln Leu Thr Gly Asp Asn Val Asn 
305 310 315 320 

Ser Phe Gln Ala Arg Val Phe Ser Ser Asn Asp Ile Gly Glu Arg Ile 
325 330 335 

Glu Leu Ser Asp Gly Thr Tyr Thr Leu Thr Glu Leu Asn Ser Pro Ala 
340 345 350 

Gly Tyr Ser Ile Ala Glu Pro Ile Thr Phe Lys Val Glu Ala Gly Lys 
355 360 365 

Val Tyr Thr Ile Ile Asp Gly Lys Gln Ile Glu Asn Pro Asn Lys Glu 
370 375 380 

Ile Val Glu Pro Tyr Ser Val Glu Ala Tyr Asn Asp Phe Glu Glu Phe 
385 390 395 400 

Ser Val Leu Thr Thr Gln Asn Tyr Ala Lys Phe Tyr Tyr Ala Lys Asn 
405 410 415 

Lys Asn Gly Ser Ser Gln Val Val Tyr Cys Phe Asn Ala Asp Leu Lys 
420 425 430 

Ser Pro Pro Asp Ser Glu Asp Gly Gly Lys Thr Met Thr Pro Asp Phe 
435 440 445 

Thr Thr Gly Glu Val Lys Tyr Thr His Ile Ala Gly Arg Asp Leu Phe 
450 455 460 

Lys Tyr Thr Val Lys Pro Arg Asp Thr Asp Pro Asp Thr Phe Leu Lys 
465 470 475 480 

His Ile Lys Lys Val Ile Glu Lys Gly Tyr Arg Glu Lys Gly Gln Ala 
485 490 495 

Ile Glu Tyr Ser Gly Leu Thr Glu Thr Gln Leu Arg Ala Ala Thr Gln 
500 505 510 

Leu Ala Ile Tyr Tyr Phe Thr Asp Ser Ala Glu Leu Asp Lys Asp Lys 
515 520 525 

Leu Lys Asp Tyr His Gly Phe Gly Asp Met Asn Asp Ser Thr Leu Ala 
530 535 540 

Val Ala Lys Ile Leu Val Glu Tyr Ala Gln Asp Ser Asn Pro Pro Gln 
545 550 555 560 

Leu Thr Asp Leu Asp Phe Phe Ile Pro Asn Asn Asn Lys Tyr Gln Ser 
565 570 575 

Leu Ile Gly Thr Gln Trp His Pro Glu Asp Leu Val Asp Ile Ile Arg 
580 585 590 

Met Glu Asp Lys Lys Glu Val Ile Pro Val Thr His Asn Leu Thr Leu 
595 600 605 

Arg Lys Thr Val Thr Gly Leu Ala Gly Asp Arg Thr Lys Asp Phe His 
610 615 620 

Phe Glu Ile Glu Leu Lys Asn Asn Lys Gln Glu Leu Leu Ser Gln Thr 
625 630 635 640 

Val Lys Thr Asp Lys Thr Asn Leu Glu Phe Lys Asp Gly Lys Ala Thr 
645 650 655 

Ile Asn Leu Lys His Gly Glu Ser Leu Thr Leu Gln Gly Leu Pro Glu 
660 665 670 

Gly Tyr Ser Tyr Leu Val Lys Glu Thr Asp Ser Glu Gly Tyr Lys Val 
675 680 685 

Lys Val Asn Ser Gln Glu Val Ala Asn Ala Thr Val Ser Lys Thr Gly 
690 695 700 

Ile Thr Ser Asp Glu Thr Leu Ala Phe Glu Asn Asn Lys Glu Pro Val 
705 710 715 720 

Val Pro Thr Gly Val Asp Gln Lys Ile Asn Gly Tyr Leu Ala Leu Ile 
725 730 735 

Val Ile Ala Gly Ile Ser Leu Gly Ile Trp Gly Ile His Thr Ile Arg 
740 745 750 

Ile Arg Lys His Asp 
755 

<210> 3 
<211> 2229 
<212> DNA 

<213> Streptococcus pyogenes 
<400> 3 

ttgcaaaaga gggataaaac caattatgga agcgctaaca acaaacgacg acaaacgacg 60 
atcggattac tgaaagtatt tttgacgttt gtagctctga taggaatagt agggttttct 120 
atcagagcgt tcggagctga agaacaatca gtaccaaata gacaaagctc aattcaagat 180 
tatccgtggt atggctatga ttcttatcct aaaggctacc cagactatag tccgttaaag 240 
acttaccata atttaaaagt aaatttagag ggaagtaagg attatcaagc atactgcttt 300 
aatttaacaa aacattttcc atccaagtca gatagtgtta gatcacaatg gtataaaaaa 360 
cttgaaggaa ctaatgaaaa ctttatcaag ttagcagata aaccaagaat agaagacgga 420 
cagttacaac aaaatatatt gaggattctc tataatggat atcctaataa tcgtaatggg 480 
ataatgaaag ggatagatcc tctaaacgct attttagtga ctcaaaatgc tatttggtat 540 
actgattcag ctcaaattaa tccggatgaa agttttaaaa cagaagctcg aagtaatggt 600
attaatgacc agcagttagg cttaatgcga aaagctttaa aagaactaat tgatccaaac 660
ttagggtcaa aatattcgaa taaaactcca tcaggttatc ggttaaatgt atttgaatct 720
catgataagc ctttccaaaa tcttttgagt gctgagtatg ttccggatac tcccccaaaa 780
ccaggagaag agcctccggc taaaactgaa aaaacatcag tcattatcag aaaatatgcg 840
gaaggtgact ctaaacttct agagggagca accttaaagc tttctcaaat tgaaggaagt 900
ggttttcaag aaaaagactt tcaaagtaat agtttaggag aaactgtcga attaccaaat 960
gggacttata ccttaacaga aacatcatct ccagatggat ataaaattgc ggagccgatt 1020
aagtttagag tagagaataa aaaagtattt atcgtccaaa aagatggttc tcaagtggaa 1080
aatccaaaca aagaagtagc agagccatac tcagtggaag cgtataatga ctttatggat 1140
gaagaagtac tctcgggttt tactccatac ggaaaattct attacgctac aaataaggat 1200
aaaagttcac aagttgtcta ctgcttcaat gctgatttac actcaccacc tgactcatat 1260
gatagtggtg agactataaa tccagatact agtacgatga aagaagtcaa gtacacacat 1320
acggcaggta gtgacttgtt taaatatgcg ctaagaccga gagatacaaa tccagaagac 1380
ttcttaaagc acattaaaaa agtaattgaa aaaggctaca agaaaaaagg tgatagctat 1440
aatggattaa cagaaacaca gtttcgcgcg gctactcagc ttgctatcta ttattttaca 1500
gacagtgctg acttaaaaac cttaaaaact tataacaatg ggaaaggtta ccatggattt 1560
gaatctatgg atgaaaaaac cctagctgtc acaaaagaat taattactta tgctcaaaat 1620
ggcagtgccc ctcaactaac aaatcttgat ttcttcgtac ctaataatag caaagaccaa 1680
tctcttattg ggacagaatg ccatccagat gatttggttg acgtgattcg tatggaagat 1740
aaaaagcaag aagttattcc agtaactcac agtttgacag tgaaaaaaac agtagtcggt 1800
gagttgggag ataaaactaa aggcttccaa tttgaacttg agttgaaaga taaaactgga 1860
cagcctattg ttaacactct aaaaactaat aatcaagatt tagtagctaa agatgggaaa 1920
tattcattta atctaaagca tggtgacacc ataagaatag aaggattacc gacgggatat 1980
tcttatactc tgaaagaggc tgaagctaag gattatatag taaccgttga taacaaagtt 2040
agtcaagaag cgcagtcagt aggtaaggat ataacagaag acaaaaaagt cacttttgaa 2100
aaccgaaaag atcttgtccc accaactggt ttgacaacag atggggctat ctatctttgg 2160
ttgttattac ttgttccact tgggttattg gtttggctat ttggtcgtaa agggttaaaa 2220
aatgactaa 2229

<210> 4
<211> 742 
<212> PRT 

<213> Streptococcus pyogenes 
<400> 4 

Met Gln Lys Arg Asp Lys Thr Asn Tyr Gly Ser Ala Asn Asn Lys Arg
1 5 10 15

Arg Gln Thr Thr Ile Gly Leu Leu Lys Val Phe Leu Thr Phe Val Ala
20 25 30

Leu Ile Gly Ile Val Gly Phe Ser Ile Arg Ala Phe Gly Ala Glu Glu
35 40 45

Gln Ser Val Pro Asn Arg Gln Ser Ser Ile Gln Asp Tyr Pro Trp Tyr
50 55 60

Gly Tyr Asp Ser Tyr Pro Lys Gly Tyr Pro Asp Tyr Ser Pro Leu Lys
65 70 75 80

Thr Tyr His Asn Leu Lys Val Asn Leu Glu Gly Ser Lys Asp Tyr Gln
85 90 95

Ala Tyr Cys Phe Asn Leu Thr Lys His Phe Pro Ser Lys Ser Asp Ser
100 105 110

Val Arg Ser Gln Trp Tyr Lys Lys Leu Glu Gly Thr Asn Glu Asn Phe
115 120 125

Ile Lys Leu Ala Asp Lys Pro Arg Ile Glu Asp Gly Gln Leu Gln Gln
130 135 140

Asn Ile Leu Arg Ile Leu Tyr Asn Gly Tyr Pro Asn Asn Arg Asn Gly
145 150 155 160

Ile Met Lys Gly Ile Asp Pro Leu Asn Ala Ile Leu Val Thr Gln Asn
165 170 175

Ala Ile Trp Tyr Thr Asp Ser Ala Gln Ile Asn Pro Asp Glu Ser Phe
180 185 190

Lys Thr Glu Ala Arg Ser Asn Gly Ile Asn Asp Gln Gln Leu Gly Leu
195 200 205

Met Arg Lys Ala Leu Lys Glu Leu Ile Asp Pro Asn Leu Gly Ser Lys
210 215 220

Tyr Ser Asn Lys Thr Pro Ser Gly Tyr Arg Leu Asn Val Phe Glu Ser
225 230 235 240

His Asp Lys Pro Phe Gln Asn Leu Leu Ser Ala Glu Tyr Val Pro Asp
245 250 255

Thr Pro Pro Lys Pro Gly Glu Glu Pro Pro Ala Lys Thr Glu Lys Thr
260 265 270

Ser Val Ile Ile Arg Lys Tyr Ala Glu Gly Asp Ser Lys Leu Leu Glu
275 280 285

Gly Ala Thr Leu Lys Leu Ser Gln Ile Glu Gly Ser Gly Phe Gln Glu
290 295 300

Lys Asp Phe Gln Ser Asn Ser Leu Gly Glu Thr Val Glu Leu Pro Asn
305 310 315 320

Gly Thr Tyr Thr Leu Thr Glu Thr Ser Ser Pro Asp Gly Tyr Lys Ile
325 330 335

Ala Glu Pro Ile Lys Phe Arg Val Glu Asn Lys Lys Val Phe Ile Val
340 345 350

Gln Lys Asp Gly Ser Gln Val Glu Asn Pro Asn Lys Glu Val Ala Glu
355 360 365

Pro Tyr Ser Val Glu Ala Tyr Asn Asp Phe Met Asp Glu Glu Val Leu
370 375 380

Ser Gly Phe Thr Pro Tyr Gly Lys Phe Tyr Tyr Ala Thr Asn Lys Asp
385 390 395 400

Lys Ser Ser Gln Val Val Tyr Cys Phe Asn Ala Asp Leu His Ser Pro
405 410 415

Pro Asp Ser Tyr Asp Ser Gly Glu Thr Ile Asn Pro Asp Thr Ser Thr
420 425 430

Met Lys Glu Val Lys Tyr Thr His Thr Ala Gly Ser Asp Leu Phe Lys
435 440 445

Tyr Ala Leu Arg Pro Arg Asp Thr Asn Pro Glu Asp Phe Leu Lys His
450 455 460

Ile Lys Lys Val Ile Glu Lys Gly Tyr Lys Lys Lys Gly Asp Ser Tyr
465 470 475 480

Asn Gly Leu Thr Glu Thr Gln Phe Arg Ala Ala Thr Gln Leu Ala Ile
485 490 495

Tyr Tyr Phe Thr Asp Ser Ala Asp Leu Lys Thr Leu Lys Thr Tyr Asn
500 505 510

Asn Gly Lys Gly Tyr His Gly Phe Glu Ser Met Asp Glu Lys Thr Leu
515 520 525

Ala Val Thr Lys Glu Leu Ile Thr Tyr Ala Gln Asn Gly Ser Ala Pro
530 535 540

Gln Leu Thr Asn Leu Asp Phe Phe Val Pro Asn Asn Ser Lys Asp Gln
545 550 555 560

Ser Leu Ile Gly Thr Glu Cys His Pro Asp Asp Leu Val Asp Val Ile
565 570 575

Arg Met Glu Asp Lys Lys Gln Glu Val Ile Pro Val Thr His Ser Leu
580 585 590

Thr Val Lys Lys Thr Val Val Asp Glu Leu Gly Asp Lys Thr Lys Gly
595 600 605

Phe Gln Phe Glu Leu Glu Leu Lys Asp Lys Thr Gly Gln Pro Ile Val
610 615 620

Asn Thr Leu Lys Thr Asn Asn Gln Asp Leu Val Ala Lys Asp Gly Lys
625 630 635 640

Tyr Ser Phe Asn Leu Lys His Gly Asp Thr Ile Arg Ile Glu Gly Leu
645 650 655

Pro Thr Gly Tyr Ser Tyr Thr Leu Lys Glu Ala Glu Ala Lys Asp Tyr
660 665 670

Ile Val Thr Val Asp Asn Lys Val Ser Gln Glu Ala Gln Ser Val Gly
675 680 685

Lys Asp Ile Thr Glu Asp Lys Lys Val Thr Phe Glu Asn Arg Lys Asp
690 695 700

Leu Val Pro Pro Thr Gly Leu Thr Thr Asp Gly Ala Ile Tyr Leu Trp
705 710 715 720

Leu Leu Leu Leu Val Pro Leu Gly Leu Leu Val Trp Leu Phe Gly Arg
725 730 735

Lys Gly Leu Lys Asn Asp
740

What Is Claimed Is:

1. An isolated nucleic acid molecule encoding a collagen-binding protein, wherein 
the collagen-binding protein is isolated from group A Streptococcus bacteria. 

2. The isolated nucleic acid molecule of Claim 1, wherein the protein is encoded by
an amino acid sequence selected from the group consisting of SEQ ID NO. 2 and SEQ
ID NO. 4.

3. The isolated nucleic acid of Claim 1, comprising a sequence selected from the 
group consisting of SEQ ID NO. 1 and SEQ ID NO. 3. 

4. The isolated nucleic acid of Claim 1, comprising a sequence that selectively
hybridizes to a sequence selected from the group consisting of SEQ ID NO. 1 and SEQ
ID NO. 3.

5. An isolated collagen-binding protein from group A streptococci selected from the
group consisting of Cpa1 and Cpa49.

6. An isolated collagen-binding protein from group A streptococci having a sequence
selected from the group consisting of SEQ ID NO. 2 and SEQ ID NO. 4.

7. An isolated collagen-binding protein according to Claim 6 wherein the protein is 
isolated from Streptococcus pyogenes. 

8. Antibody or antisera raised against the protein of Claim 5.

9. A diagnostic kit for determining the presence of Cpa1 or Cpa49 proteins
comprising a protein according to Claim 5 and means to introduce the protein to a 
sample. 

10. A diagnostic kit for determining the presence of Cpa1 or Cpa49 proteins 
comprising antibodies reactive with either Cpa1 or Cpa49 and means to introduce the
antibodies to a sample. 

11. A pharmaceutical composition for treating or preventing a group A streptococcal 
infection comprising the protein of Claim 5 and a pharmaceutically acceptable carrier or
excipient. 

12. A pharmaceutical composition for treating or preventing a group A streptococcal 
infection comprising antisera or an antibody according to Claim 5 and a 
pharmaceutically acceptable carrier or excipient. 

13. A method of treating or preventing a group A streptococcal infection in a patient
comprising administering to the patient an isolated protein selected from the group 
consisting of Cpa1, Cpa49 and active fragments thereof in an amount sufficient to 
inhibit binding of group A streptococci to collagen. 

14. The method of Claim 13, wherein the infection treated is selected from the group
consisting of skin and mucous membrane infections, connective tissue infections and 
septicemia. 

15. A method of treating or preventing a group A streptococcal infection in a patient 
comprising administering to the patient antibodies to a protein selected from the group 
consisting of Cpa1, Cpa49 and active fragments thereof in an amount sufficient to 
inhibit binding of group A streptococci to collagen. 



16. A method of reducing group A streptococci infection of an indwelling medical 
device comprising coating the medical device with a composition comprising a protein 
selected from the group consisting of Cpa1, Cpa49 and active fragments thereof.

17. The method of Claim 16 wherein the medical device is selected from the group 
consisting of vascular grafts, vascular stents, intravenous catheters, artificial heart 
valves, and cardiac assist devices. 

18. A method of inducing an immunological response comprising administering to a 
patient a composition comprising an isolated protein selected from the group consisting 
of Cpa1, Cpa49 and active fragments thereof.

Abstract of the Disclosure

Isolated proteins, designated Cpa1 and Cpa49, and their corresponding amino 
acid and nucleic acid sequences are provided which are useful in the prevention and 
treatment of infection caused by group A streptococcal bacteria such as Streptococcus 
pyogenes. These proteins have been observed to bind to collagen, and thus methods 
are provided, such as by administration of the proteins or antibodies generated thereto, 
whereby streptococcal binding of collagen can be inhibited, and streptococcal infection 
can be greatly reduced. In addition, medical instruments can be treated using the 
collagen-binding proteins of the invention in order to reduce or eliminate the possibility 
of their becoming infected or further spreading the infection. In particular, the proteins 
are advantageous because they may be used as vaccine components or antibodies 
thereof, and they may be administered to wounds or used to coat biomaterials in order 
to act as collagen blocking agents and reduce or prevent severe infection by group A 
streptococcal bacteria.

Transcript analysis of selected genes in GAS wt and nra mutant strains

Transcript analysis of regulatory genes in GAS wt, mga and nra mutant strains

Binding to immobilized human matrix proteins by GAS wt and nra-mutant strains grown in / on THY medium to logarithmic / stationary growth phase in an aerobic / anaerobic atmosphere